ABSTRACT


A system which is capable of acting as an adaptive binary 
detector is proposed and analyzed. Exponential smoothing is used for 
estimation of the mean. A technique similar to exponential smoothing is 
used for estimation of the variance. The system uses the frame synchro- 
nization code as a teacher in order to adapt itself to the character- 
istics of the environment. Decision Directed Measurements are used when 
the frame synchronization code is not available. The speed and accuracy 
of the different techniques are derived in this study. The optimum 
location of the initial conditions of the system is also determined. 


CHAPTER I 


INTRODUCTION 

This dissertation examines the feasibility of receiving binary 
digital communication signals with an adaptive detector which adjusts
its threshold in accordance with the need of a slowly varying or previ- 
ously unknown environment. The emphasis will be upon an adaptive 
technique selected primarily for the simplicity of its implementation. 
The technique will be analyzed to show how it offers improvement over 
making no change in the threshold location of an optimum detector. A 
system of this type is needed for use in spacecraft or aircraft systems 
where simplicity and small size are important characteristics of a 
system. 

For the case of a binary system operating in an environment of 
additive white Gaussian noise, the optimum Bayes detector consists of 
two matched filters, a subtracter, and a threshold device. One of the 
two filters is matched to the binary 0 waveform and the other to the 
binary 1 waveform. The received signal is applied simultaneously to 
the two matched filters. A decision, concerning which symbol was trans- 
mitted, is made by comparing the difference of the outputs of the two 
matched filters to a threshold. For many communication systems, condi- 
tions are often such that the optimum location of the threshold is fixed 
and known. However, there are, or may arise, conditions such that the 
optimum location of the threshold depends on parameters which are 
neither constant nor known. Conceivable examples are: (1) noise whose
mean or variance is subject to change, and (2) matched filters suffering
performance deterioration, a plausible condition especially when active 
filters are used. Such situations present the possibility that a Bayes 
detector which is optimum for some specific conditions will have a 
higher probability of error than an adaptive system after a change in 
the environment. 

The proposed system is capable of estimating the optimum location 
of the threshold when the unknown or variable parameter is the nonzero 
mean of the noise. In the case of unequal probability of transmission 
of a binary 0 and binary 1, the proposed system can be used to estimate 
the variance of the noise, which may be the unknown or variable para- 
meter and is necessary for the calculation of the threshold. When 
circuit failure or component drift in one of the matched filters causes 
an optimum detector to locate its threshold at a nonoptimum location, 
the proposed adaptive system is capable of moving the threshold to 
reduce the average difference between the actual threshold and the 
optimum location of the threshold. These situations are discussed in
detail in Chapter II. 

The adaptive portion of the detector receives as its input the 
difference of the outputs of the two matched filters. It chooses the 
threshold location according to calculations upon past values of its 
input. Estimates are made of the mean of the input to the adaptive 
detector when a binary 0 is transmitted, of the mean when a binary 1 
is transmitted, and of the variance of the input caused by transmission 
of either (but not both) a binary 0 or binary 1. The location of the
threshold is calculated from these estimates. Recursive equations are
used to estimate the mean and variance. This paper proposes use of the 
frame synchronization code as the teacher in a "Learning With Teacher" 
mode and use of a "Decision Directed Measurement" technique when the 
frame synchronization code has not been located. The original contributions
of this work are (1) the method of estimation of the variance
(Chapters IV and V) and (2) the determination of the effect of various 
parameters on the convergence rate for the Decision Directed Measurement 
technique operating in conjunction with the estimates of the means 
(Chapter VI). 

In the past few years the literature has contained many reports 
of work on adaptive detectors. Very few of these, however, contain 
material pertinent to the system proposed here. Papers by Abramson 
(Ref. 2) and by Abramson and Braverman (Ref. 5) were among the first to 
deal with adaptive detectors. These papers, concerned with optimal 
learning in a random environment, offer estimation techniques which, 
with modification, are useful in the proposed adaptive detector. A book 
by Hancock and Wintz (Ref. 7) has chapters pertaining to adaptive 
receivers and to learning by Decision Directed Measurement. It is 
generally concerned with optimum estimation methods. In addition, it 
presents results from computer simulations of the various learning 
schemes similar to those of Lindenlaub and Mix (Ref. 3). Cooper and
Cooper (Ref. lb) investigate a system using learning without super- 
vision and estimated means. Groginsky, Wilson, Middleton, Hancock, 
Gregg, Millard, and Kurz (Refs. 15-17) have described work which has
been performed on adaptive detectors for operation in noise of unknown
distribution. The present work is concerned with analyzing in detail a 
specific, simply implemented, method of adapting the threshold in an 
environment of white, Gaussian noise. Since the resulting system is 
suboptimum, the available work on optimum systems must be modified and 
extended.

The mean is estimated by a technique which was mentioned, but not 
analyzed, by Abramson and Braverman (Ref. 5) for the purpose of tracking 
a slowly changing parameter. The technique is similar to Kalman 
filtering (Ref. 18) except that it is designed on the basis of other 
than minimum mean-square error. Abramson (Ref. 2) gives an analysis of 
the recursive estimator of the mean for minimum mean-square error. 
Lindenlaub and Mix (Ref. 3) give the appropriate coefficients for the 
recursive equation in order to get minimum mean-square error for three 
specific autocorrelation functions of the slowly changing parameter. 

Lin and Yau (Ref. 19) discuss the Bayesian approach to the estimators 
and are concerned with minimum risk. Beine (Ref. 20) discusses an RC 
averager to estimate the mean which corresponds to the recursive 
equation used here. The most complete analysis of the recursive 
estimator is given by Brown (Ref. 4) and his analysis is used as the 
basis for the present work. 

Dale (Ref. 21) discusses an estimation of the variance by a sum 
of squares. Books by Deutsch and Good (Refs. 22 and 23) treat the area 
of estimation theory. No appropriate references were found on estimation
of the variance under the requirements of the system being
investigated. This dissertation shows how an estimate of the variance
can be found by modifying a method used by Brown (Ref. 4) to estimate
means. The method will be analyzed to derive its accuracy. 

A search of the literature for an analysis of a system using 
Decision Directed Measurement techniques in conjunction with a recursive 
estimator revealed none completely suited for detailed evaluation of the 
system proposed here. Lindenlaub and Mix (Ref. 3) and Hancock and Mix
(Ref. 10) use Monte Carlo methods to check the convergence of several
learning methods. Henry Scudder (Ref. 6) derives the asymptotic
probability of error for a DDM system which makes estimates only on the
binary 1 signals. Patrick and Costello (Ref. 1) derive the asymptotic 
probability of error for estimates on both signals but use the sample 
mean with an infinite number of samples as the estimator. The present 
work develops a computer program (Chapter VI) for performing a numerical 
convolution of the probability densities involved in the system and 
makes possible the investigation of how system operation is affected 
by various parameters such as signal-to-noise ratio and initial
estimates of the means. 

Spragins (Ref. 24) presents a review of the methods of "Learning 
Without Teacher" and points out that an optimum method of "Learning 
Without Teacher" is impractical. The present system is offered
(Chapter II), from among the many suboptimum solutions to the problem, 
as one which is practical and simple to implement. 

No construction of hardware has been performed as a part of this 
study. Computer simulation has been used for any work requiring
investigation from an experimental viewpoint. The computer has also
been used as an aid in analyzing some areas which were difficult to
evaluate in closed form. Since it is anticipated that this system would
operate in real time and might possibly be used in a spacecraft or aircraft,
it is desirable that the system be potentially fast and small.
This requirement has influenced the method of operation selected for the 
system. 

At the present time, there is no universally accepted test for 
determining whether a system is an adaptive system or a learning system. 
The system being investigated here probably fits under most definitions 
of an adaptive system rather than that of a learning system. This 
system has a specific procedure for adjusting the threshold as a 
function of the past history of the input signal and does not attempt 
to recognize situations that it has previously encountered. 


CHAPTER II 


DISCUSSION OF TOTAL SYSTEM

A detector that is optimum in the Bayes sense for binary signals 
in additive, white, Gaussian noise (Ref. 7, p. 49) is shown in Figure 1,
where

    $w(t)$ = input to the detector = $A_c s_i(t) + n(t)$
    $A_c$ = channel gain
    $s_1(t)$ = signal representing a binary 1
    $s_0(t)$ = signal representing a binary 0
    $n(t)$ = noise
    $b$ = optimum bias

This optimum detector decides that a binary 1 was sent if $u \ge 0$ and
that a binary 0 was sent if $u < 0$ by use of

    $u = W^T (S_1 - S_0) - b$    (2-1)

where

    $b = \frac{\sigma_n^2}{A_c} \ln K + \frac{1}{2} A_c (S_1^T S_1 - S_0^T S_0) + \bar{N}^T (S_1 - S_0)$    (2-2)

In these equations the capital letters are matrices (Ref. 7, 
pp. 231-243) representing the functions of time shown in Figure 1. The
superscript T indicates the transpose of the matrix. The factor K 
is a decision boundary which appears in the Bayes calculations and is

    $K = \frac{P_0}{1 - P_0}$

for the cost of an error in detecting a binary 1 equal to the cost of an
error in detecting a binary 0, where $P_0$ is the probability of transmitting
a binary 0. The mean of the noise is represented by $\bar{N}$, and
$\sigma_n^2$ is the variance of the noise. It is consistent with (2-1) to
say that the detector decides that a binary 1 or binary 0 was transmitted
by determining if $W^T (S_1 - S_0)$ is greater than or less than the
threshold, b. The adaptive detector employs as its threshold an
estimation, $\beta$, of the optimum bias based on past history of the
difference

    $X = W^T (S_1 - S_0)$

Figure 1.- Structure of binary detector.


The estimated threshold, $\beta$, is obtained by first averaging the estimate
of the mean of X when a binary 1 is transmitted and the estimate of
the mean when a binary 0 is transmitted, and then adding the ratio
$\hat{\sigma}_n^2 / \hat{A}_c$ (the estimated variance of the noise divided by the
estimated channel gain) multiplied by ln K; thus

    $\beta = \frac{1}{2} \left[ E\{X \mid W = A_c S_1 + N\} + E\{X \mid W = A_c S_0 + N\} \right] + \frac{\hat{\sigma}_n^2}{\hat{A}_c} \ln K$    (2-3)


The only characteristics of the system which must be known are K, the 
waveforms $s_1(t)$ and $s_0(t)$, and the fact that the noise is additive,
white and Gaussian. The mean and variance of the noise and the channel
gain can all be unknown or slowly-varying parameters. 


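In order to show that $\beta$ corresponds to the optimum bias for
perfect estimates of the means and variance, the expression for $\beta$
(2-3) is rearranged to yield

    $\beta = \frac{1}{2} \left[ E\{(A_c S_1 + N)^T (S_1 - S_0)\} + E\{(A_c S_0 + N)^T (S_1 - S_0)\} \right] + \frac{\sigma_n^2}{A_c} \ln K$

    $\quad = \frac{1}{2} \left[ A_c S_1^T S_1 - A_c S_1^T S_0 + \bar{N}^T (S_1 - S_0) + A_c S_0^T S_1 - A_c S_0^T S_0 + \bar{N}^T (S_1 - S_0) \right] + \frac{\sigma_n^2}{A_c} \ln K$

    $\quad = \frac{1}{2} A_c (S_1^T S_1 - S_0^T S_0) + \bar{N}^T (S_1 - S_0) + \frac{\sigma_n^2}{A_c} \ln K$    (2-4)

where the cross terms cancel because $S_1^T S_0 = S_0^T S_1$, and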


which agrees with (2-2). Since the estimations are not perfect, the 
adaptive detector is actually a suboptimum receiver. 
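The agreement between the threshold built from the two conditional means and the optimum bias of (2-2) can be checked numerically. The following is a minimal sketch, assuming the signals are represented as short sample vectors; the function names and all numerical values are hypothetical and serve only as an illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def optimum_bias(s1, s0, gain, noise_mean, noise_var, k):
    """Bias b computed directly from known parameters, as in (2-2)."""
    diff = [p - q for p, q in zip(s1, s0)]
    return ((noise_var / gain) * math.log(k)
            + 0.5 * gain * (dot(s1, s1) - dot(s0, s0))
            + dot(noise_mean, diff))


def threshold_from_means(s1, s0, gain, noise_mean, noise_var, k):
    """Threshold formed by averaging the two conditional means of X
    and adding (variance / gain) * ln K."""
    diff = [p - q for p, q in zip(s1, s0)]
    mean_one = dot([gain * p + m for p, m in zip(s1, noise_mean)], diff)
    mean_zero = dot([gain * q + m for q, m in zip(s0, noise_mean)], diff)
    return 0.5 * (mean_one + mean_zero) + (noise_var / gain) * math.log(k)


# Hypothetical example values (not taken from the source).
S1 = [1.0, 2.0, -1.0]
S0 = [-1.0, 0.5, 1.0]
NOISE_MEAN = [0.2, -0.1, 0.3]
```

With perfect knowledge of the means and of the variance-to-gain ratio, both functions return the same value for any choice of inputs, which is the content of the rearrangement above.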


The ratio $\sigma_n^2 / A_c$ in equation (2-4) is shown in Appendix I to be
estimated by

    [Equation (2-5): expression not legible in the source.]

The methods of estimation of the means and of the variance are discussed
in Chapters III and IV, respectively. The above analysis shows that the
adaptive detector derives an estimate of the optimum location of the
threshold when the channel gain and mean or variance of the noise are
unknown. The value of the decision boundary K must be known for this 
situation. 
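The Appendix I result is not reproduced in this chapter. As a hedged sketch only, one ratio that follows from the definitions above is the conditional variance of X divided by the separation of the two conditional means. This assumes that the noise samples are uncorrelated with common variance $\sigma_n^2$; whether it matches the form derived in Appendix I is an assumption.

```latex
\begin{align*}
\operatorname{Var}\{X \mid W = A_c S_i + N\}
  &= \sigma_n^2 \,(S_1 - S_0)^T (S_1 - S_0), \\
E\{X \mid W = A_c S_1 + N\} - E\{X \mid W = A_c S_0 + N\}
  &= A_c \,(S_1 - S_0)^T (S_1 - S_0), \\
\frac{\sigma_n^2}{A_c}
  &= \frac{\operatorname{Var}\{X \mid W = A_c S_i + N\}}
          {E\{X \mid W = A_c S_1 + N\} - E\{X \mid W = A_c S_0 + N\}} .
\end{align*}
```

Under these assumptions the ratio depends only on quantities the adaptive detector already estimates, namely the two conditional means and the variance of its input.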

For the special case of K = 1, the adaptive detector can also be 
used to increase the reliability of the receiver by making it possible 
to relocate the threshold in the event of a degradation in one of the 
matched filters in an optimum detector. The degradation would cause the 
optimum detector to be operating with the threshold at a nonoptimum 
location. The equation implemented by the degraded optimum detector is 

    $u = W^T (S_1 + \epsilon - S_0) - b$

where $\epsilon$ is the change in one of the matched filters. The threshold of
the degraded optimum detector differs from the optimum location by an
amount, d, given by

    $d = W^T \epsilon = (A_c S_i + N)^T \epsilon$;  $i = 0, 1$    (2-6)

In order to restore the threshold to the optimum location, it is
necessary to determine $\epsilon$ and subtract it from the term $(S_1 + \epsilon - S_0)$.
The adaptive system can be used to improve this situation when it is
not practical or possible to determine $\epsilon$ and to make the necessary
adjustments. This specific adaptive system is limited to situations
of K = 1 for degradations in the matched filters because the estimation
techniques used here do not give an accurate estimate of the
variance when the input is degraded. Other estimation methods may
permit this system to be used for $K \ne 1$, but they have not been
investigated here. 

The following analysis is included to show that the adaptive 
detector locates its threshold at 

    $\beta = b + E\{d\}$

for perfect estimations of the means. For calculation of $\beta$ according
to (2-3), with K = 1, the input is

    $X = W^T (S_1 + \epsilon - S_0)$

and the estimates of the means are given by

    $E\{X \mid W = A_c S_1 + N\} = E\{(A_c S_1 + N)^T (S_1 + \epsilon - S_0)\}$
    $\quad = A_c S_1^T S_1 - A_c S_1^T S_0 + \bar{N}^T (S_1 - S_0) + (A_c S_1 + \bar{N})^T \epsilon$    (2-7a)

    $E\{X \mid W = A_c S_0 + N\} = E\{(A_c S_0 + N)^T (S_1 + \epsilon - S_0)\}$
    $\quad = A_c S_0^T S_1 - A_c S_0^T S_0 + \bar{N}^T (S_1 - S_0) + (A_c S_0 + \bar{N})^T \epsilon$    (2-7b)

The threshold according to (2-3) is

    $\beta = \frac{1}{2} A_c (S_1^T S_1 - S_0^T S_0) + \bar{N}^T (S_1 - S_0) + \frac{1}{2} \left[ A_c S_1 + A_c S_0 + 2\bar{N} \right]^T \epsilon$    (2-8)

This differs from the optimum threshold given by (2-2) for K = 1 by
the amount of the last term which is shown to be the expected value of
d, the distance to the optimum location of the threshold. For equally
probable signals, K = 1, the expected value of d is obtained from
(2-6) to yield

    $E\{d\} = \frac{1}{2} E\{(A_c S_1 + N)^T \epsilon\} + \frac{1}{2} E\{(A_c S_0 + N)^T \epsilon\}$
    $\quad = \frac{1}{2} \left[ A_c S_1 + A_c S_0 + 2\bar{N} \right]^T \epsilon$

This corresponds to the last term of equation (2-8).

The degraded optimum detector implements 

    $u = W^T (S_1 - S_0) - b + d$

The adaptive detector, with perfect estimation, implements

    $u = W^T (S_1 - S_0) - b + d - E\{d\}$

Since $d - E\{d\}$ is less than d on the average, the adaptive detector
is closer to the optimum location of the threshold than the degraded 
optimum detector. This results in a suboptimum detector but would offer 
improvement for sufficiently large values of d over the continued use 
of the degraded optimum detector. 
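The effect of subtracting the expected offset can be illustrated numerically. The sketch below assumes equally probable symbols and a small perturbation vector in one matched filter; the function names and the numerical values are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def offset_for_symbol(symbol, gain, noise_mean, eps):
    """Mean of d = W^T eps given which symbol vector was transmitted."""
    return dot([gain * p + m for p, m in zip(symbol, noise_mean)], eps)


def expected_offset(s1, s0, gain, noise_mean, eps):
    """E{d} for equally probable symbols:
    0.5 * (Ac*S1 + Ac*S0 + 2*Nbar)^T eps."""
    vec = [gain * (p + q) + 2.0 * m for p, q, m in zip(s1, s0, noise_mean)]
    return 0.5 * dot(vec, eps)


# Hypothetical example values (not taken from the source).
S1 = [1.0, 2.0, -1.0]
S0 = [-1.0, 0.5, 1.0]
NOISE_MEAN = [0.2, -0.1, 0.3]
EPS = [0.1, 0.0, -0.05]
GAIN = 0.8

mean_offset = expected_offset(S1, S0, GAIN, NOISE_MEAN, EPS)
residual_one = offset_for_symbol(S1, GAIN, NOISE_MEAN, EPS) - mean_offset
residual_zero = offset_for_symbol(S0, GAIN, NOISE_MEAN, EPS) - mean_offset
```

In this sketch the two residuals are equal and opposite, so the threshold error that remains after adaptation averages to zero over the two symbols, whereas the degraded detector retains the full mean offset.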

The input to the adaptive detector has been called X, which 
represents a matrix. In practice, the input is the sampled output of 
the subtracter at the end of a bit time. The input to the adaptive 
detector is, therefore, a sequence of values, $x_k$; each value represents
the processing of a single bit by the matched filters. Bit synchronization
is assumed for this study.

For proper operation of the adaptive detector, it is necessary
to perform separate estimates of the mean of x when a binary 0 has
been transmitted, and of the mean of x when a binary 1 has been
transmitted. For this to be accomplished, it is necessary to know 
whether the transmitted signal was intended to be a binary 0 or binary 1 
in order to know which estimator to update. 

This system operates on digital communication systems in which 
the data are transmitted in serial fashion over a single channel. In 
a system of this type a known sequence of bits, called the frame 
synchronization code, is normally inserted into the sequence of data 
bits in order to synchronize the decoder located at the receiver with 
the encoder located at the transmitter. After synchronization has been 
obtained, the proposed system uses the fact that during transmission of 
the frame synchronization code, the correct decision is known. The 
system knows from the synchronization code if the received signal was 
intended to be a binary 0 or a binary 1 and updates the appropriate 
estimator. This is a form of Abramson's "Learning With Teacher"
(Ref. 2) where the frame synchronization code is the teacher.

Since the "Learning With Teacher" scheme cannot be used during 
the transmission of actual data, the system may either cease its 
estimation until the appearance of the next synchronization code or 
may use some form of "Learning Without Teacher." The operation of the
system before the synchronization code has been located also requires 
the use of "Learning Without Teacher" since the correct decision is not 
known. The proposed system uses a Decision Directed Measurement (DDM) 
technique similar to that discussed in Reference 3, page 13. In DDM,
a decision is made with all available information and assumed to be
correct. The decision determines which estimator is to be updated.
Incorrect decisions are possible in DDM and cause the system to converge
more slowly than a "Learning With Teacher" system. There are other techniques
of "Learning Without Teacher," but the DDM technique appears to
offer the best combination of convergence rate and implementation 
simplicity as shown by Lindenlaub and Mix (Ref. 3, pp. 13-37). 

The DDM technique operates in such a manner that the expected 
value of the estimate of the mean when a binary 1 is transmitted moves 
to the mean of all signals above the threshold. This estimate of the
mean is not unbiased when a binary 1 is transmitted because some of the
signals above the threshold are due to the transmission of a binary 0
and some of the signals due to the transmission of a binary 1 fall
below the threshold. Therefore, it is necessary to use the DDM 
technique only to move the threshold so that enough correct decisions 
can be made to enable the frame synchronization code to be located. The 
DDM scheme should not be used after the frame synchronization code is 
located. The "Learning With Teacher" scheme is required to give an 
unbiased estimate of the conditional means. 
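The bias of the DDM estimate can be illustrated with a fixed threshold. The sketch below assumes that the detector input is Gaussian with a common standard deviation under both symbols, and computes the mean of all inputs that exceed the threshold. The function names and numerical values are hypothetical, and holding the threshold fixed is a simplification of the adaptive case.

```python
import math


def q_function(z):
    """Upper-tail probability of a standard Gaussian variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def gaussian_pdf(z):
    """Standard Gaussian probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def mean_above_threshold(mean_zero, mean_one, sigma, threshold, p_zero=0.5):
    """Mean of all detector inputs that exceed `threshold` when the input
    is a two-component Gaussian mixture (binary 0 and binary 1)."""
    numerator = 0.0
    mass = 0.0
    for mean, prior in ((mean_zero, p_zero), (mean_one, 1.0 - p_zero)):
        z = (threshold - mean) / sigma
        tail = q_function(z)
        # Partial first moment of N(mean, sigma^2) over x > threshold.
        numerator += prior * (mean * tail + sigma * gaussian_pdf(z))
        mass += prior * tail
    return numerator / mass


# Hypothetical case: means of -1 and +1, unit standard deviation,
# threshold midway between the means.
biased_estimate = mean_above_threshold(-1.0, 1.0, 1.0, 0.0)
```

For these values the mean of the inputs above the threshold is about 1.17, which exceeds the true conditional mean of 1.0; this is the kind of bias that makes the "Learning With Teacher" scheme preferable once the frame synchronization code has been located.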

The analysis of the estimation techniques can be performed 
without knowing if the inputs are coming from the "Learning With 
Teacher" scheme or from the DDM scheme. The estimators are only 
required to perform computations on the data given to them. The 
"Learning With Teacher" or DDM technique performs the function of 
deciding which estimator receives each individual input sample. The 
convergence and accuracy of the estimations are derived as functions
of the input data, and this analysis applies for both the "Learning
With Teacher" and DDM cases. 

Three specific problems associated with this adaptive detector 
are analyzed and are discussed here. These three are: 

1. Estimation of the Mean 

2. Estimation of the Variance

3. Operation of the system when controlled by the Decision
Directed Measurement technique. 

Both the estimation of the mean and the estimation of variance are 
performed by recursive equations. The estimate of the mean is 
accomplished by use of exponential smoothing (Chapter III). A technique 
similar to the exponential smoothing method is used for the estimate 
of the variance (Chapter IV). Computer simulation is used to study the 
operation of the Decision Directed Measurement technique and the 
estimation methods (Chapter VI). 

Figure 2 shows a block diagram of the system discussed here. The 
input to this system is a sequence of random values; each value 
represents $W^T (S_1 - S_0)$ for a single bit. The threshold decision
element examines each random value. If the random value is above the 
threshold, the output is the decision that the bit is a binary 1. If 
the value is below the threshold, the output is a binary 0. The 
threshold computer determines an estimated value of the threshold by 
calculating the terms in (2-3) from the two estimates of the means and 
from the estimate of the variance. The resulting value of the threshold 
is furnished to the threshold decision element. This is equivalent to
estimating the bias, b (2-2). The estimators use the input data to
make estimates of three of its properties. The control block determines
which estimators are updated. The sync detector looks for the frame
synchronization code in the output of the threshold decision element.
If the sync detector has properly located the frame synchronization
code, the control block is directed to use the frame synchronization
code to determine which set of estimators to update. If the sync
detector has not located the frame synchronization code, the control
block is directed to use the decisions of the threshold decision element
to determine which set of estimators to update. This latter case is
called Decision Directed Measurement.

Figure 2.- Adaptive binary detector.
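The role of the control block can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation studied here: the function name, its arguments, and the choice to skip updates during data bits once synchronization has been obtained are all hypothetical.

```python
def select_estimator(sample, threshold, sync_located, known_bit=None):
    """Return which set of estimators (1 or 0) should receive `sample`,
    or None when no update should be made.

    sync_located -- True once the sync detector has found the frame
                    synchronization code.
    known_bit    -- the sync-code bit currently being received, or None
                    while data bits are being received.
    """
    if sync_located:
        # Learning With Teacher: the sync code supplies the correct
        # decision; data bits carry no label, so no update is made.
        return known_bit
    # Decision Directed Measurement: the threshold decision is assumed
    # to be correct and selects the estimators to update.
    return 1 if sample >= threshold else 0
```

Before synchronization, `select_estimator(0.3, 0.0, False)` selects the binary 1 estimators; after synchronization, the same sample received during a sync-code bit known to be a binary 0 is routed to the binary 0 estimators regardless of the threshold decision.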

The properties of interest in this investigation are accuracy, 
speed of response to step changes in the input, complexity of equipment 
required to implement the system, and required calculation time. 
Techniques have been selected which appear to require simple implementa- 
tion and which have potentially low calculation times. In this investi- 
gation, the accuracy and speed of response are found to be variables 
which must be traded off against each other. 



CHAPTER III 


ESTIMATION OF THE MEAN 
Description of the Method 

One of the tasks which the adaptive detector must perform is the 
estimation of the mean of the received signals representing a binary 0 
and representing a binary 1. These signals are $(A_c S_0 + N)^T (S_1 - S_0)$
and $(A_c S_1 + N)^T (S_1 - S_0)$, respectively. For this reason, there are
two estimators of the mean in the adaptive detector; logic circuits 
determine which of the two is updated. The estimators are identical, 
hence, an analysis of one can be extended to the other. The input to 
the estimator is a sequence of values which represents the received 
data. This sequence of values consists of the differences of the outputs
of the two matched filters at the ends of transmissions of
successive bits. As previously mentioned, the technique for estimation 
should be simple, accurate, and capable of reacting quickly to abrupt 
changes in the characteristics of the input. 

The technique selected for estimation of the mean is exponential 
smoothing, which was introduced and analyzed by Brown (Ref. 4). He uses
the following equation for the estimation of the mean, with $\alpha$ written
for $(1 - A)$:

    $\bar{x}_k = A \bar{x}_{k-1} + (1 - A) x_k$;  $k = 1, 2, 3, \ldots$    (3-1)

where

    $\bar{x}_k$ = kth estimate of the mean
    $x_k$ = kth data sample
    $A$ = recursive constant; $0 < A < 1.0$


The input to the estimation equation is a series of values, $x_k$, which
are considered to be samples of a random function. The task of the
estimator is to estimate the mean, $E\{x\}$, of the random function, where
the mean is defined by

    $E\{x\} = \int_{-\infty}^{\infty} x \, p(x) \, dx$

where p(x) is the probability density of x. Requirements for use of
the above definition of the mean are given on page 64 of Reference 8.

The estimation begins with an initial guess, $\bar{x}_0$, of the mean of
x. This value is used in conjunction with the first data point, $x_1$,
to compute the next estimate, $\bar{x}_1$, of the mean. This process is
continued as each succeeding input sample is applied to the estimation
equation. Calculations are simple and quickly made because only two
multiplications and one addition are required. Storage is needed only
for the most recent estimate, $\bar{x}_{k-1}$.
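The recursion can be written directly in code. The following is a minimal sketch of exponential smoothing with the previous estimate weighted by A and the new sample by (1 - A); the function names are hypothetical.

```python
def smooth(previous_estimate, sample, a):
    """One update of the estimate: two multiplications and one addition."""
    return a * previous_estimate + (1.0 - a) * sample


def run_estimator(samples, initial_estimate, a):
    """Apply the recursion to a sequence of samples and return every
    intermediate estimate. Only the latest estimate is carried forward."""
    if not 0.0 < a < 1.0:
        raise ValueError("recursive constant must satisfy 0 < A < 1")
    estimate = initial_estimate
    history = []
    for sample in samples:
        estimate = smooth(estimate, sample, a)
        history.append(estimate)
    return history
```

For example, `run_estimator([1.0, 1.0], 0.0, 0.5)` returns `[0.5, 0.75]`: each sample moves the estimate half of the remaining distance toward the input when A = 0.5.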

Analysis of the Method 

Since the input, $x_k$, to the estimator is a random variable, the
estimated mean also is a random variable. The mean and variance of the
estimated mean are used to determine the accuracy of the estimation.

For proper operation of the estimator, $\bar{x}_k$ should be
asymptotically unbiased (Ref. 8, p. 463), that is, the mean of $\bar{x}_k$
should approach the actual mean of the input as the number of samples
handled approaches infinity. The variance of $\bar{x}_k$ is an indication of
the error of the estimate and should be as small as possible. Unfortunately,
small variance of the estimated mean is achieved at the expense
of reaction time to abrupt changes in the mean of the input as will be
shown in Figure 3.

The mean of $\bar{x}_k$ is derived as a function of k in order to
show that the estimation is asymptotically unbiased. For a stationary
input, Appendix II shows that the mean of $\bar{x}_k$ is

    $E\{\bar{x}_k\} = A^k \bar{x}_0 + a (1 - A)(1 + A + A^2 + \cdots + A^{k-1})$

where a is the mean of the input data, $x_k$. Since $|A| < 1.0$,

    $\lim_{k \to \infty} A^k \bar{x}_0 = 0$

and

    $\lim_{k \to \infty} (1 + A + A^2 + \cdots + A^{k-1}) = \frac{1}{1 - A}$

Therefore,

    $\lim_{k \to \infty} E\{\bar{x}_k\} = a$    (3-2)


This demonstrates that the estimation of the mean by the exponential 
smoothing method is asymptotically unbiased. For nonstationary inputs, 
the analysis of the estimation technique still applies if the input 
changes very slowly with respect to the response time of the estimation. 
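The approach of the expected estimate to the input mean can be checked numerically. The sketch below uses the fact that the finite geometric sum gives $(1 - A)(1 + A + \cdots + A^{k-1}) = 1 - A^k$, and compares that closed form with direct iteration of the recursion on expected values. The function names are hypothetical.

```python
def expected_estimate(k, a, initial_estimate, input_mean):
    """Closed form for the mean of the k-th estimate:
    A^k * x0 + input_mean * (1 - A^k)."""
    return (a ** k) * initial_estimate + input_mean * (1.0 - a ** k)


def expected_estimate_by_iteration(k, a, initial_estimate, input_mean):
    """Same quantity obtained by applying the recursion k times to the
    expected values (the input mean is taken as constant)."""
    value = initial_estimate
    for _ in range(k):
        value = a * value + (1.0 - a) * input_mean
    return value
```

With A = 0.95, an initial estimate of 5.0 and an input mean of 2.0, both functions agree for every k, and the result is within rounding error of 2.0 once k reaches a few hundred samples.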
The variance of $\bar{x}_k$ is also derived in Appendix II and is found
to be

    variance of $\bar{x}_k = (1 - A)^2 \sigma^2 \sum_{i=0}^{k-1} (A^2)^i$

where $\sigma^2$ is the variance of the input, $x_k$. The limiting value of the
variance is found to be

    $\lim_{k \to \infty}$ variance of $\bar{x}_k = (1 - A)^2 \sigma^2 \frac{1}{1 - A^2} = \left( \frac{1 - A}{1 + A} \right) \sigma^2$    (3-3)


Some observations can be made at this point. Since the estima- 
tion equation is linear, the estimate of the mean is Gaussian if the 
input data are Gaussian (Ref. 9). The estimation technique is not 
limited to input data with a Gaussian distribution and should be able 
to operate on any distribution for which the mean exists. However, the 
probability density of $\bar{x}_k$ would be very difficult to calculate for
distributions other than Gaussian. The Cauchy distribution is an
example of a distribution which could not be used here since none of
its moments exist (Ref. 8, p. 157). If the characteristics of the data
are not time-varying and if the estimation began with an initial
estimate $\bar{x}_0$, the actual variance of $\bar{x}_k$ is always less than the
asymptotic value. This can be seen from the fact that only positive
terms are added to the variance as k increases. The limiting value
of the variance of $\bar{x}_k$ can be made as small as desired by making A
closer to 1.0. Since the variance of the estimate is an indication of
the error, the estimator can be made as accurate as desired.
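The growth of the variance toward its limiting value can be tabulated with a short routine. This is a sketch assuming independent input samples of common variance and a fixed (non-random) initial estimate; the function names are hypothetical.

```python
def variance_ratio(k, a):
    """Variance of the k-th estimate divided by the input variance:
    (1 - A)^2 * sum over i = 0 .. k-1 of (A^2)^i."""
    return (1.0 - a) ** 2 * sum((a * a) ** i for i in range(k))


def limiting_variance_ratio(a):
    """Limit of variance_ratio as k grows: (1 - A) / (1 + A)."""
    return (1.0 - a) / (1.0 + a)
```

Each additional sample adds a positive term, so `variance_ratio(k, a)` increases with k and stays below `limiting_variance_ratio(a)`; for A = 0.95 the limit is approximately 0.0256.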
In order to determine the speed of response to a step input, it
is necessary to determine the transfer function of the estimation
technique. The transfer function is determined in Appendix II and is
found to be

    $H(z) = \frac{(1 - A) z}{z - A}$

The time constant associated with this transfer function is

    $t_c = \frac{-T}{\ln A}$

with T the interval between samples. The time constant also can be
expressed as the number of samples

    $n_s = \frac{-1}{\ln A}$

This value, $n_s$, is positive since 0 < A < 1.0 which means that the
logarithm of A is negative. It has been shown in Equation (3-3) that
A should be as near 1.0 as possible in order to reduce the variance
of the estimate. However, A should be near zero in order to reduce
the time required to respond to a change in the input characteristics.
A potential user of this system is required to make a tradeoff study
in order to determine the optimum value of A for his particular
application. The variance of $\bar{x}_k$ and the time constant, $n_s$, are
plotted in Figure 3 to aid the user in his selection of A.

An example which illustrates one procedure for selecting A
follows: Due to the application of an adaptive system, it is required
that the estimate of the mean be able to react in not more than 100
samples to a step change in the mean of the data coming into the
estimator. The accuracy of the estimator should remain as high as
possible. After five time constants, the mean of the estimated mean has
covered more than 99 per cent of the distance to the final value. This
means that the estimator should have a time constant of 20 samples.
Choosing A = 0.95 gives a time constant of 19.5. For this choice of A,
the asymptotic variance of the estimated mean is 0.0256 times the
variance of the input data.

Figure 3.- Variance and time constant of the estimation of the mean.
[Axis label: time constant, $n_s$ (number of samples).]
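The selection procedure in this example can be reproduced as a short calculation. The sketch assumes that "reacting" means settling within five time constants, as in the example; the function names are hypothetical.

```python
import math


def time_constant_samples(a):
    """Time constant in samples, n_s = -1 / ln A."""
    return -1.0 / math.log(a)


def constant_for_response(max_samples, time_constants=5.0):
    """Value of A whose time constant is max_samples / time_constants."""
    n_s = max_samples / time_constants
    return math.exp(-1.0 / n_s)


def limiting_variance_ratio(a):
    """Asymptotic variance of the estimate divided by the input variance."""
    return (1.0 - a) / (1.0 + a)


# Fraction of a step covered after five time constants.
settled_fraction = 1.0 - math.exp(-5.0)
```

For a 100-sample requirement, `constant_for_response(100)` gives A of roughly 0.9512, so the rounded choice A = 0.95 responds slightly faster than required (a time constant of about 19.5 samples) at a variance ratio of about 0.0256.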

Two other techniques which were considered for estimation of the
mean are a sliding window and a running calculation of the sample mean
of all previous samples. The running calculation of the sample mean
is not practical since it must take into account the number of previous
samples, which could possibly exceed the capacity of the computer used
for the computation during long periods of operation. It is also slow
to react to changes in the data characteristics if the number of
previous samples is very large. The sliding window method uses a fixed
number of samples and computes their sample mean. The variance of the
sample mean is $\sigma^2 / n$ (Ref. 11, p. 2^6) where $\sigma^2$ is the variance of the
input and n is the number of samples in the window. In order to
have the same variance of estimated mean for exponential smoothing,
Equation (3-3), and a sliding window,

    $\frac{\sigma^2}{n} = \left( \frac{1 - A}{1 + A} \right) \sigma^2$

and

    $n = \frac{1 + A}{1 - A}$

From consideration of the general range of accuracy and speed requirements,
A will probably be

    $0.9 < A < 1.0$


The following table shows the value of n required as a function of A
in order to enable the sliding window system to have a variance equal
to that of the recursive system:

    A        $n_s$      n
    0.9      9.491      19
    0.95     19.497     39
    0.99     99.502     199
    0.999    1000       1999
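The window lengths in the table follow from equating the two variances. The sketch below computes the equivalent window length n = (1 + A)/(1 - A) alongside the time constant in samples; the function names are hypothetical.

```python
import math


def equivalent_window_length(a):
    """Sliding-window length whose sample-mean variance equals the
    limiting variance of the recursive estimator: (1 + A) / (1 - A)."""
    return (1.0 + a) / (1.0 - a)


def time_constant_samples(a):
    """Time constant of the recursive estimator in samples, -1 / ln A."""
    return -1.0 / math.log(a)


def comparison_rows(values):
    """Return (A, n_s, n) for each recursive constant in `values`."""
    return [(a, time_constant_samples(a), equivalent_window_length(a))
            for a in values]


rows = comparison_rows([0.9, 0.95, 0.99, 0.999])
```

In every row the window length is close to twice the time constant, which is why the sliding window clears a step change after roughly two time constants of the recursive estimator.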


The table also shows the corresponding time constant, $n_s$, of the
recursive equation. All effects of the previous characteristics of
the input disappear from the sliding window technique when n samples
have been processed after the step change; meanwhile the exponential
smoothing technique has undergone approximately two time constants.
If the initial estimate for both estimators is zero and the final
estimate is 1.0, the expected value of the recursive estimate moves as
$(1 - e^{-t/t_c})$. The expected value of the sample mean of the sliding
window moves linearly between zero and 1.0. For the recursive
estimator, the integral of the difference between the final value and
the actual value is

    ERROR $= \int_0^{\infty} \left[ 1 - \left( 1 - e^{-t/t_c} \right) \right] dt = \int_0^{\infty} e^{-t/t_c} \, dt = t_c$

The same calculation for the sample mean of a sliding window is

    ERROR $= \int_0^{2 t_c} \left( 1 - \frac{t}{2 t_c} \right) dt = \left[ t - \frac{t^2}{4 t_c} \right]_0^{2 t_c} = 2 t_c - t_c = t_c$

This shows that the integral of the difference between the final value
and the actual expected value of the mean is the same for the recursive
estimator and the sample mean of a sliding window. However, the amount
of equipment required for implementation of a sliding window technique,
due to the requirement of storing and labeling hundreds or thousands of
previous samples, removed the sliding window technique from further
consideration in this application.

For updating the estimators during the frame synchronization
code, it may be practical to use the sample mean of the synchronization
code block as the estimator. However, this is not advantageous when
operating in the DDM mode because if blocks of n samples are used in
this mode to calculate the sample mean, the system must use the old
estimate for n samples while waiting for a new estimate. This results
in additional error as will be shown below. The recursive estimator
with variance equal to that of the sample mean has moved two time
constants closer to the new location than the sample mean during the n
samples. The expected value of the sample mean of a block of n
samples remains at zero for a time equal to $2t_c$ and then jumps to
1.0. The integral of the difference for the sample mean of a block of
n samples is

\[ \mathrm{ERROR} = \int_0^{2t_c} (1 - 0)\, dt = 2t_c \]

This shows that the use of blocks of n samples yields more error than
use of either a recursive estimator or a sliding window. The same
results are obtained for any location of the step change with respect
to the location of the block of n samples. The recursive estimator
has an additional advantage over the calculation of the sample mean
since the accuracy of the recursive estimator can be changed by simply
changing a single constant. A change in the number of samples used is
required to change the accuracy of the sample mean.
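
The three integrated errors can also be verified numerically. The
sketch below is an illustration under stated assumptions: the time
constant of 10 is arbitrary, the window and block lengths are taken as
exactly two time constants, and the integral is approximated with a
midpoint rule. None of the names come from the original work.

```python
import math

def integrated_error(expected_value, t_end, steps=40_000):
    """Midpoint-rule integral of (1 - expected_value(t)) from 0 to t_end."""
    dt = t_end / steps
    return sum((1.0 - expected_value((i + 0.5) * dt)) * dt for i in range(steps))

t_c = 10.0            # arbitrary time constant (assumption)
window = 2.0 * t_c    # window or block length of two time constants

recursive = lambda t: 1.0 - math.exp(-t / t_c)    # exponential approach to 1.0
sliding = lambda t: min(t / window, 1.0)          # linear ramp over the window
block = lambda t: 0.0 if t < window else 1.0      # jump at the end of the block

horizon = 40.0 * t_c  # long enough for the exponential tail to vanish
errors = {
    "recursive": integrated_error(recursive, horizon),
    "sliding window": integrated_error(sliding, horizon),
    "block of n": integrated_error(block, horizon),
}
for name, value in errors.items():
    print(f"{name:15s} {value / t_c:.3f} time constants")
```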




CHAPTER IV 


ESTIMATION OF THE VARIANCE 


A technique similar to the estimation of the mean is used for the
estimation of the variance. The equation used is

\[ \hat{\sigma}_k^2 = B\hat{\sigma}_{k-1}^2 + \frac{1 - B}{C}\left(x_k - \hat{x}_k\right)^2 \tag{4-1} \]

where

    $\hat{\sigma}_k^2$ is the kth estimate of the variance
    $\hat{x}_k$ is the kth estimate of the mean
    $x_k$ is the kth data sample
    $B$ is a constant and $B < 1.0$
    $C$ is a constant

This equation uses an initial estimate of the variance,
$\hat{\sigma}_0^2$, plus the received data sample and the present
estimate of the mean in order to make a new estimate of the variance.
The constant C will be determined later and is required to make this
technique converge to the proper value, that is, to remove the bias of
this estimate.

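If $\hat{x}_k$ is replaced by its equivalent given by (3-1), a more
usable form of (4-1) is obtained for $\hat{\sigma}_k^2$:

\[
\begin{aligned}
\hat{\sigma}_k^2 &= B\hat{\sigma}_{k-1}^2 + \frac{1 - B}{C}\left[x_k - A\hat{x}_{k-1} - (1 - A)x_k\right]^2 \\
                 &= B\hat{\sigma}_{k-1}^2 + \frac{A^2(1 - B)}{C}\left[x_k - \hat{x}_{k-1}\right]^2
\end{aligned}
\tag{4-2}
\]

This equation for $\hat{\sigma}_k^2$ leads to faster calculation than
the preceding one since it employs $\hat{x}_{k-1}$ instead of
$\hat{x}_k$. This calculation can be performed in parallel with the
kth estimate of the mean instead of having to wait until $\hat{x}_k$
is calculated.

As was done in the case of the estimate of the mean, the mean
and variance of the estimate of the variance are determined. The
constant, C, will be selected to force the mean of the estimate of the
variance to converge to the actual variance of the data being sampled.
The variance of the estimate serves as an indication of the average
error of the estimate.

The mean of the estimated variance is calculated for several
values of k. Enough terms are used in order to recognize the series
being generated. The general expression is then written, and the
limiting value is determined. Thus,

\[ \hat{\sigma}_1^2 = B\hat{\sigma}_0^2 + \frac{A^2(1 - B)}{C}\left(x_1 - \hat{x}_0\right)^2 \]

and

\[ E\{\hat{\sigma}_1^2\} = B\hat{\sigma}_0^2 + \frac{A^2(1 - B)}{C}\left[\sigma^2 + (a - \hat{x}_0)^2\right] \]

The same techniques are used to calculate $E\{\hat{\sigma}_2^2\}$,
which gives

\[
\begin{aligned}
\hat{\sigma}_2^2 &= B\left[B\hat{\sigma}_0^2 + \frac{A^2(1 - B)}{C}\left(x_1 - \hat{x}_0\right)^2\right]
   + \frac{A^2(1 - B)}{C}\left[x_2 - A\hat{x}_0 - (1 - A)x_1\right]^2 \\
 &= B^2\hat{\sigma}_0^2 + \frac{A^2 B(1 - B)}{C}\left[x_1^2 - 2x_1\hat{x}_0 + \hat{x}_0^2\right] \\
 &\quad + \frac{A^2(1 - B)}{C}\left[x_2^2 + A^2\hat{x}_0^2 + (1 - A)^2 x_1^2 - 2A x_2\hat{x}_0
   - 2(1 - A)x_1 x_2 + 2A(1 - A)\hat{x}_0 x_1\right]
\end{aligned}
\]

and

\[
\begin{aligned}
E\{\hat{\sigma}_2^2\} &= B^2\hat{\sigma}_0^2 + \frac{A^2 B(1 - B)}{C}\left[E\{x_1^2\} - 2\hat{x}_0 E\{x_1\} + \hat{x}_0^2\right] \\
 &\quad + \frac{A^2(1 - B)}{C}\left[E\{x_2^2\} + A^2\hat{x}_0^2 + (1 - A)^2 E\{x_1^2\} - 2A\hat{x}_0 E\{x_2\}
   - 2(1 - A)E\{x_1 x_2\} + 2A(1 - A)\hat{x}_0 E\{x_1\}\right]
\end{aligned}
\]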

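The data samples are considered to be independent so that

\[ E\{x_i x_j\} = E\{x_i\}\,E\{x_j\} \quad \text{for } i \ne j \]

This yields

\[
\begin{aligned}
E\{\hat{\sigma}_2^2\} &= B^2\hat{\sigma}_0^2 + \frac{A^2 B(1 - B)}{C}\left[\sigma^2 + a^2 - 2a\hat{x}_0 + \hat{x}_0^2\right] \\
 &\quad + \frac{A^2(1 - B)}{C}\left[\sigma^2 + a^2 + A^2\hat{x}_0^2 + (1 - A)^2(a^2 + \sigma^2) - 2Aa\hat{x}_0
   - 2(1 - A)a^2 + 2A(1 - A)a\hat{x}_0\right] \\
 &= B^2\hat{\sigma}_0^2 + \frac{A^2(1 - B)}{C}\left[(1 + B)\sigma^2 + (1 - A)^2\sigma^2 + (A^2 + B)(a - \hat{x}_0)^2\right]
\end{aligned}
\]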

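If these same techniques are used, the mean values of
$\hat{\sigma}_3^2$ and $\hat{\sigma}_4^2$ can be determined:

\[
E\{\hat{\sigma}_3^2\} = B^3\hat{\sigma}_0^2 + \frac{A^2(1 - B)}{C}\left[(1 + B + B^2)\sigma^2
  + (1 + A^2 + B)(1 - A)^2\sigma^2 + (A^4 + A^2 B + B^2)(a - \hat{x}_0)^2\right]
\]

\[
E\{\hat{\sigma}_4^2\} = B^4\hat{\sigma}_0^2 + \frac{A^2(1 - B)}{C}\Big[(1 + B + B^2 + B^3)\sigma^2
  + \left[(1 + A^2 + A^4) + B(1 + A^2) + B^2\right](1 - A)^2\sigma^2
  + (A^6 + A^4 B + A^2 B^2 + B^3)(a - \hat{x}_0)^2\Big]
\]

From these four mean values it is possible to recognize the general term
of this series as

\[
E\{\hat{\sigma}_k^2\} = B^k\hat{\sigma}_0^2 + \frac{A^2(1 - B)}{C}\left[\sigma^2\sum_{i=0}^{k-1} B^i
  + (a - \hat{x}_0)^2\sum_{i=0}^{k-1} A^{2i}B^{k-1-i}
  + (1 - A)^2\sigma^2\sum_{j=0}^{k-2}\left(B^j\sum_{i=0}^{k-2-j} A^{2i}\right)\right]
\tag{4-3}
\]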


The next problem is to find the value of $E\{\hat{\sigma}_k^2\}$ as k
approaches infinity. Since $|B|$ is less than 1.0,

\[ \lim_{k \to \infty} B^k\hat{\sigma}_0^2 = 0 \]

and

\[ \lim_{k \to \infty} \sum_{i=0}^{k-1} B^i = \frac{1}{1 - B} \]

The limiting value of the next term in (4-3) is found from

\[
(a - \hat{x}_0)^2\sum_{i=0}^{k-1} A^{2i}B^{k-1-i}
  = (a - \hat{x}_0)^2\left[B^{k-1} + A^2 B^{k-2} + A^4 B^{k-3} + \cdots + A^{2k-2}\right]
\]


It is known that

\[ 0 < A < 1, \qquad 0 < B < 1 \]

Let

\[ A < C_2 < 1, \qquad B < C_2 < 1 \]

If $C_2$ is substituted for A and B in the above series, the
resulting series is greater term by term than the original series
involving A and B. If the limiting value of the series of $C_2$ is
shown to approach zero, the limiting value of the series of A and B
must also approach zero. Thus,

\[ \sum_{i=0}^{k-1} C_2^{2i}\,C_2^{k-1-i} = C_2^{k-1}\sum_{i=0}^{k-1} C_2^{i} \]

This is a truncated geometric series whose partial sum, $s_k$
(Ref. 15), is

\[ s_k = C_2^{k-1}\,\frac{1 - C_2^{k}}{1 - C_2} = \frac{C_2^{k-1} - C_2^{2k-1}}{1 - C_2} \]

Since $C_2 < 1$,

\[ \lim_{k \to \infty} s_k = 0 \]

Therefore,

\[ \lim_{k \to \infty}\,(a - \hat{x}_0)^2\sum_{i=0}^{k-1} A^{2i}B^{k-1-i} = 0 \]


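The last term in (4-3) is

\[
\begin{aligned}
(1 - A)^2\sigma^2\sum_{j=0}^{k-2}\left(B^j\sum_{i=0}^{k-2-j} A^{2i}\right)
 = (1 - A)^2\sigma^2\Big[&(1 + A^2 + A^4 + \cdots + A^{2k-4}) + B(1 + A^2 + A^4 + \cdots + A^{2k-6}) \\
 &+ B^2(1 + A^2 + A^4 + \cdots + A^{2k-8}) + \cdots + B^{k-3}(1 + A^2) + B^{k-2}\Big]
\end{aligned}
\]

The limiting value is

\[
\begin{aligned}
\lim_{k \to \infty}(1 - A)^2\sigma^2\sum_{j=0}^{k-2}\left(B^j\sum_{i=0}^{k-2-j} A^{2i}\right)
 &= (1 - A)^2\sigma^2\,(1 + A^2 + A^4 + \cdots)(1 + B + B^2 + \cdots) \\
 &= \frac{(1 - A)^2\sigma^2}{(1 - A^2)(1 - B)} \\
 &= \frac{(1 - A)\sigma^2}{(1 + A)(1 - B)}
\end{aligned}
\]

These expressions are inserted into the equation for
$E\{\hat{\sigma}_k^2\}$ in order to find the limit as k approaches
infinity. Thus,

\[
\begin{aligned}
\lim_{k \to \infty} E\{\hat{\sigma}_k^2\}
 &= \frac{A^2(1 - B)}{C}\left[\frac{\sigma^2}{1 - B} + \frac{(1 - A)\sigma^2}{(1 + A)(1 - B)}\right] \\
 &= \frac{A^2\sigma^2}{C}\left[1 + \frac{1 - A}{1 + A}\right] \\
 &= \frac{2A^2\sigma^2}{C(1 + A)}
\end{aligned}
\]

In order for this limit to converge to the actual variance,
$\sigma^2$, of the function being sampled, we must have

\[ C = \frac{2A^2}{1 + A} \]

The value of C obtained above is inserted into the estimation
equation (4-2) to give

\[ \hat{\sigma}_k^2 = B\hat{\sigma}_{k-1}^2 + \frac{(1 - B)(1 + A)}{2}\left[x_k - \hat{x}_{k-1}\right]^2 \tag{4-4} \]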
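
The recursion and its lack of bias can be illustrated with a short
program. The sketch below assumes the mean update of (3-1) has the form
x_hat_k = A x_hat_(k-1) + (1 - A) x_k and that the variance update has
the gain (1 - B)(1 + A)/2 on the squared difference. The second
function propagates the expected value of the variance estimate for
independent samples without drawing random numbers. All names are
hypothetical.

```python
def update(x_hat, var_hat, x, A, B):
    """One step of the recursive estimators.

    Returns the new (mean estimate, variance estimate) pair.  The variance
    update uses the previous mean estimate, so both updates could be
    computed in parallel."""
    new_var = B * var_hat + 0.5 * (1.0 - B) * (1.0 + A) * (x - x_hat) ** 2
    new_mean = A * x_hat + (1.0 - A) * x
    return new_mean, new_var

def expected_variance_estimate(a, sigma2, x0, var0, A, B, steps):
    """Propagate E{var_hat_k} for independent samples with mean a and
    variance sigma2, starting from the initial guesses x0 and var0."""
    mean_x_hat, var_x_hat, e_var_hat = x0, 0.0, var0
    for _ in range(steps):
        # E{(x_k - x_hat_{k-1})^2} = sigma^2 + Var{x_hat_{k-1}} + bias^2
        e_sq = sigma2 + var_x_hat + (a - mean_x_hat) ** 2
        e_var_hat = B * e_var_hat + 0.5 * (1.0 - B) * (1.0 + A) * e_sq
        mean_x_hat = A * mean_x_hat + (1.0 - A) * a
        var_x_hat = A * A * var_x_hat + (1.0 - A) ** 2 * sigma2
    return e_var_hat

limit = expected_variance_estimate(a=3.0, sigma2=4.0, x0=0.0, var0=0.0,
                                   A=0.95, B=0.95, steps=2000)
print(limit)  # approaches the true variance, sigma2
```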



CHAPTER V 


DERIVATION OF THE VARIANCE OF THE ESTIMATED VARIANCE 

The variance of the estimated variance is also of interest since
it gives an indication of the error of the estimate. Due to the
complexity of the procedure of calculating the variance of the estimated
variance for the general case, the derivation is performed here only for
input data consisting of samples taken from a Gaussian distribution with
mean of "a" and variance of "$\sigma^2$". However, the technique of
estimation described in Chapter IV is not limited to this case; it
applies to any probability distribution whose mean and variance exist.
If the moments of a variable are expressed in terms of the mean and
variance of the variable, it is found that moments of order greater
than two are dependent on the probability distribution of the variable.
The variance of the estimated variance is a function of the probability
distribution since it involves moments of order greater than two. The
moments of a Gaussian variable are shown in Appendix VII.

Since the equation used for estimation of the mean is a linear
equation, the estimated mean has a Gaussian distribution if the data
have a Gaussian distribution. The term, $(x_k - \hat{x}_{k-1})$, is the
difference of two terms, each of which has a Gaussian probability
distribution. The probability distribution of the difference is also
Gaussian. The square of this difference has a chi-square distribution
with one degree of freedom (Ref. 11, pp. 250-253).

In order to investigate the probability distribution of the
estimated variance it is necessary to examine several estimation steps
using

\[ \hat{\sigma}_k^2 = B\hat{\sigma}_{k-1}^2 + \frac{(1 - B)(1 + A)}{2}\left[x_k - \hat{x}_{k-1}\right]^2 \]

The initial guess, $\hat{\sigma}_0^2$, has a delta function for a
probability distribution since it can have only one value. The
distribution of $\hat{\sigma}_1^2$ is the weighted convolution of a
delta function and a chi-square distribution with its origin shifted.
The equation for $\hat{\sigma}_2^2$ is a weighted sum of
$\hat{\sigma}_1^2$ and $(x_2 - \hat{x}_1)^2$. Because of the estimation
technique $\hat{x}_1$ is not independent of $\hat{\sigma}_1^2$ and is
fixed exactly when $\hat{\sigma}_1^2$ is determined. However, $x_2$ is
independent of either $\hat{\sigma}_1^2$ or $\hat{x}_1$. The
distribution of $\hat{\sigma}_2^2$ is a weighted convolution of the
chi-square distribution representing $\hat{\sigma}_1^2$ and the
distribution $p\left[(x_2 - \hat{x}_1)^2 \mid \hat{\sigma}_1^2\right]$,
which is a chi-square distribution with its mean a function of
$\hat{\sigma}_1^2$. The probability distribution of any estimate of the
variance by this recursive equation is a weighted convolution of the
distribution of the previous estimate and a chi-square distribution
whose mean is determined by the previous estimate of the variance. The
probability distribution of the estimate of the variance is not
determined since it is not practical to make a detailed calculation.
Although the distribution of the estimate of the variance is not
derived, its variance serves as an indication of the error of the
estimate. The error decreases as the variance decreases.

Using the definition of variance, the variance of
$\hat{\sigma}_k^2$ is

\[
\begin{aligned}
\text{variance of } \hat{\sigma}_k^2
 &= B^2\left(\text{variance of } \hat{\sigma}_{k-1}^2\right) \\
 &\quad + B(1 - B)(1 + A)\Big[E\{x_k^2\hat{\sigma}_{k-1}^2 - 2x_k\hat{x}_{k-1}\hat{\sigma}_{k-1}^2
   + \hat{x}_{k-1}^2\hat{\sigma}_{k-1}^2\} - E\{(x_k - \hat{x}_{k-1})^2\}\,E\{\hat{\sigma}_{k-1}^2\}\Big] \\
 &\quad + \frac{(1 - B)^2(1 + A)^2}{4}\left[\text{variance of } (x_k - \hat{x}_{k-1})^2\right] \\
 &= B^2\left(\text{variance of } \hat{\sigma}_{k-1}^2\right)
   + \frac{(1 - B)^2(1 + A)^2}{4}\left[\text{variance of } (x_k - \hat{x}_{k-1})^2\right] \\
 &\quad + B(1 - B)(1 + A)\Big[(a^2 + \sigma^2)E\{\hat{\sigma}_{k-1}^2\} - 2aE\{\hat{x}_{k-1}\hat{\sigma}_{k-1}^2\}
   + E\{\hat{x}_{k-1}^2\hat{\sigma}_{k-1}^2\} - E\{(x_k - \hat{x}_{k-1})^2\}\,E\{\hat{\sigma}_{k-1}^2\}\Big]
\end{aligned}
\tag{5-1}
\]

This last step can be made since $\hat{\sigma}_{k-1}^2$ and
$\hat{x}_{k-1}$ are independent of $x_k$.

Let k = n where n is large enough so that all terms in the
equation for (variance of $\hat{\sigma}_n^2$) except (variance of
$\hat{\sigma}_{n-1}^2$) have become infinitesimally close to their
limiting values. The convergence of each of these terms is shown in the
appendices by the derivation of their limiting values. In Appendix III
these are

\[ \lim_{k \to \infty}\ \text{variance of } (x_k - \hat{x}_{k-1})^2 = \frac{8\sigma^4}{(1 + A)^2} \tag{5-2} \]

and

\[ \lim_{k \to \infty} E\{(x_k - \hat{x}_{k-1})^2\} = \frac{2\sigma^2}{1 + A} \tag{5-3} \]

Appendix IV shows that

\[ \lim_{k \to \infty} E\{\hat{x}_{k-1}\hat{\sigma}_{k-1}^2\} = a\sigma^2 \tag{5-4} \]

Appendix V shows that

\[ \lim_{k \to \infty} E\{\hat{x}_{k-1}^2\hat{\sigma}_{k-1}^2\}
   = a^2\sigma^2 + \frac{1 - A}{1 + A}\,\sigma^4 + \frac{(1 - A)^2(1 - B)}{(1 + A)(1 - A^2 B)}\,\sigma^4 \tag{5-5} \]

By the choice of C in Equation (4-2) we have insured that

\[ \lim_{k \to \infty} E\{\hat{\sigma}_k^2\} = \sigma^2 \tag{5-6} \]


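Substitution of Equations (5-2) - (5-6) into Equation (5-1) yields

\[
\begin{aligned}
\text{variance of } \hat{\sigma}_n^2
 &= B^2\left(\text{variance of } \hat{\sigma}_{n-1}^2\right)
   + \frac{(1 - B)^2(1 + A)^2}{4}\cdot\frac{8\sigma^4}{(1 + A)^2} \\
 &\quad + B(1 - B)(1 + A)\left[(a^2 + \sigma^2)\sigma^2 - 2a(a\sigma^2) + a^2\sigma^2
   + \frac{1 - A}{1 + A}\,\sigma^4 + \frac{(1 - A)^2(1 - B)}{(1 + A)(1 - A^2 B)}\,\sigma^4
   - \frac{2\sigma^2}{1 + A}\,\sigma^2\right]
\end{aligned}
\tag{5-7}
\]

\[
\begin{aligned}
 &= B^2\left(\text{variance of } \hat{\sigma}_{n-1}^2\right) + 2(1 - B)^2\sigma^4 \\
 &\quad + B(1 - B)(1 + A)\left[a^2\sigma^2 + \sigma^4 - 2a^2\sigma^2 + a^2\sigma^2
   + \frac{1 - A}{1 + A}\,\sigma^4 + \frac{(1 - A)^2(1 - B)}{(1 + A)(1 - A^2 B)}\,\sigma^4
   - \frac{2\sigma^4}{1 + A}\right] \\
 &= B^2\left(\text{variance of } \hat{\sigma}_{n-1}^2\right) + 2(1 - B)^2\sigma^4
   + B(1 - B)(1 + A)\left[\frac{(1 - A)^2(1 - B)}{(1 + A)(1 - A^2 B)}\,\sigma^4\right] \\
 &= B^2\left(\text{variance of } \hat{\sigma}_{n-1}^2\right) + 2(1 - B)^2\sigma^4
   + \frac{B(1 - B)^2(1 - A)^2}{1 - A^2 B}\,\sigma^4
\end{aligned}
\tag{5-8}
\]

The method of determining the limiting value of variance of
$\hat{\sigma}_n^2$ is to insert some constant, M, for (variance of
$\hat{\sigma}_{n-1}^2$). Several terms are determined in order to
recognize the series being generated. Thus,

\[ \text{variance of } \hat{\sigma}_{n-1}^2 = M \]

\[ \text{variance of } \hat{\sigma}_{n}^2 = B^2 M + (1 - B)^2\left[\frac{B(1 - A)^2}{1 - A^2 B} + 2\right]\sigma^4 \]

\[ \text{variance of } \hat{\sigma}_{n+1}^2 = B^4 M + (1 + B^2)(1 - B)^2\left[\frac{B(1 - A)^2}{1 - A^2 B} + 2\right]\sigma^4 \]

\[ \text{variance of } \hat{\sigma}_{n+2}^2 = B^6 M + (1 + B^2 + B^4)(1 - B)^2\left[\frac{B(1 - A)^2}{1 - A^2 B} + 2\right]\sigma^4 \]

The general term is

\[ \text{variance of } \hat{\sigma}_{n+j}^2 = B^{2(j+1)} M
   + (1 - B)^2\left[\frac{B(1 - A)^2}{1 - A^2 B} + 2\right]\sigma^4\sum_{i=0}^{j} B^{2i} \]

The limiting value is

\[ \lim_{j \to \infty}\left(\text{variance of } \hat{\sigma}_{n+j}^2\right)
   = \lim_{j \to \infty} B^{2(j+1)} M
   + (1 - B)^2\left[\frac{B(1 - A)^2}{1 - A^2 B} + 2\right]\sigma^4\lim_{j \to \infty}\sum_{i=0}^{j} B^{2i} \]

Since

\[ \lim_{j \to \infty}\sum_{i=0}^{j} B^{2i} = \frac{1}{1 - B^2} \quad \text{for } |B| < 1.0 \]

\[ \lim_{j \to \infty} B^{2(j+1)} = 0 \quad \text{for } |B| < 1.0 \]

\[ \lim_{j \to \infty}\left(\text{variance of } \hat{\sigma}_{n+j}^2\right)
   = (1 - B)^2\left[\frac{B(1 - A)^2}{1 - A^2 B} + 2\right]\sigma^4\,\frac{1}{1 - B^2} \]

\[ \lim_{k \to \infty}\ \text{variance of } \hat{\sigma}_k^2
   = \lim_{j \to \infty}\left(\text{variance of } \hat{\sigma}_{n+j}^2\right)
   = \frac{1 - B}{1 + B}\left[\frac{B(1 - A)^2}{1 - A^2 B} + 2\right]\sigma^4 \tag{5-9} \]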




No attempt is made to apply the standard mathematical tests for
convergence due to the complexity of the series. The method of
derivation used above shows the convergence of the series since the
starting point has no effect on the limiting value of the sequence and
the limiting value is determined.

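A calculation which adds to the credibility of this derivation
is that of the estimation of the variance when the mean is known
exactly. For this case A is equal to 1.0 and the equation for the
variance of the estimated variance reduces to

\[ \lim_{j \to \infty}\left(\text{variance of } \hat{\sigma}_{n+j}^2\right)
   = 2\left(\frac{1 - B}{1 + B}\right)\sigma^4 \quad \text{for } A = 1.0 \tag{5-10} \]

This can be checked by actually calculating the mean and variance of the
estimated variance with the mean known exactly. Thus,

\[ \hat{\sigma}_k^2 = B\hat{\sigma}_{k-1}^2 + (1 - B)(x_k - a)^2 \tag{5-11} \]

and

\[ E\{\hat{\sigma}_1^2\} = B\hat{\sigma}_0^2 + (1 - B)\sigma^2 \]

\[ E\{\hat{\sigma}_2^2\} = B^2\hat{\sigma}_0^2 + (1 + B)(1 - B)\sigma^2 \]

\[ E\{\hat{\sigma}_3^2\} = B^3\hat{\sigma}_0^2 + (1 + B + B^2)(1 - B)\sigma^2 \]

The general term is

\[ E\{\hat{\sigma}_k^2\} = B^k\hat{\sigma}_0^2 + (1 - B)\sigma^2\sum_{j=0}^{k-1} B^j \]

Since

\[ \lim_{k \to \infty} B^k\hat{\sigma}_0^2 = 0 \quad \text{for } |B| < 1.0 \]

and

\[ \lim_{k \to \infty}\sum_{j=0}^{k-1} B^j = \frac{1}{1 - B} \quad \text{for } |B| < 1.0 \]

Then,

\[ \lim_{k \to \infty} E\{\hat{\sigma}_k^2\} = (1 - B)\sigma^2\left(\frac{1}{1 - B}\right) = \sigma^2 \]

The variance of the estimation is determined by

\[ \text{variance of } \hat{\sigma}_0^2 = 0 \]

\[ \text{variance of } \hat{\sigma}_k^2 = E\{(\hat{\sigma}_k^2)^2\} - \left[E\{\hat{\sigma}_k^2\}\right]^2 \]

The variances for several values of k are

\[ \text{variance of } \hat{\sigma}_1^2 = 2(1 - B)^2\sigma^4 \]

\[ \text{variance of } \hat{\sigma}_2^2 = 2(1 + B^2)(1 - B)^2\sigma^4 \]

\[ \text{variance of } \hat{\sigma}_3^2 = 2(1 + B^2 + B^4)(1 - B)^2\sigma^4 \]

The general term is

\[ \text{variance of } \hat{\sigma}_k^2 = 2(1 - B)^2\sigma^4\sum_{j=0}^{k-1} B^{2j} \]

The limiting value is

\[ \lim_{k \to \infty}\ \text{variance of } \hat{\sigma}_k^2
   = 2(1 - B)^2\sigma^4\left(\frac{1}{1 - B^2}\right)
   = \frac{2(1 - B)}{1 + B}\,\sigma^4 \]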


This equation checks with that obtained by letting A = 1.0 in the
general equation (5-9).

As was found in the estimation of the mean, the limiting value
of the variance of the estimate can be made as small as desired by
making the estimation constant, B, closer to 1.0. Since the estimation
of the mean is used in the estimation of the variance, the constant A
also has an effect on the limiting value of the variance of the
estimated variance. Again, A should be near 1.0 in order to make the
variance of the estimate small.
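
The limiting value and its independence of the starting point can be
checked numerically. The sketch below assumes the limiting form
((1 - B)/(1 + B)) [B(1 - A)^2/(1 - A^2 B) + 2] sigma^4 for (5-9) and
the corresponding one-step recursion for (5-8). The function names and
the test values are illustrative only.

```python
def limiting_variance(A, B, sigma2):
    """Limiting variance of the estimated variance, in the form taken for (5-9)."""
    bracket = B * (1.0 - A) ** 2 / (1.0 - A * A * B) + 2.0
    return (1.0 - B) / (1.0 + B) * bracket * sigma2 ** 2

def iterate_recursion(A, B, sigma2, M, steps):
    """Apply the one-step recursion taken for (5-8), starting from the value M."""
    bracket = B * (1.0 - A) ** 2 / (1.0 - A * A * B) + 2.0
    increment = (1.0 - B) ** 2 * bracket * sigma2 ** 2
    v = M
    for _ in range(steps):
        v = B * B * v + increment
    return v

# Two very different starting values lead to the same limit.
print(iterate_recursion(0.95, 0.9, 2.0, M=0.0, steps=500))
print(iterate_recursion(0.95, 0.9, 2.0, M=100.0, steps=500))
print(limiting_variance(0.95, 0.9, 2.0))

# With the mean known exactly (A = 1.0) the result reduces to 2(1 - B)/(1 + B) sigma^4.
print(limiting_variance(1.0, 0.9, 2.0))
```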



CHAPTER VI 


DECISION-DIRECTED-MEASUREMENT ESTIMATION TECHNIQUE 


Description of the Method 

The three previous sections of this dissertation presented an
analysis of the estimation of the mean and of the estimation of the
variance. The proper operation of these estimators requires that the
correct answer of the decision be known so that the proper estimator
can be updated. When the system is first operated in a given situation,
the location of the frame synchronization code is not known. The system
is required to move the threshold until enough correct decisions can be
made in order to locate the frame synchronization code. A form of
"Decision-Directed-Measurement" similar to that described by Lindenlaub
and Mix (Ref. P* 13) is used to control the system during the search
for the frame synchronization code. This scheme is examined, in the
pages which follow, only for the cases requiring estimates of the mean,
that is, for equally probable signals. It can be employed for cases
requiring estimation of the variance, but an analysis of this situation
is not included here.

The Decision-Directed-Measurement (DDM) is used as outlined by
the following sequence: (1) An initial guess is made of the mean,
$\hat{y}_0$, of the received signal when a binary 0 is transmitted and
of the mean, $\hat{x}_0$, of the received signal when a binary 1 is
transmitted. (2) A first selection, $\hat{\beta}_0$, for the threshold
is made according to

\[ \hat{\beta}_0 = \frac{\hat{x}_0 + \hat{y}_0}{2} \tag{6-1} \]

This is used as the threshold in order to decide whether the first
sample, $x_1$, is a binary 0 or 1. If $x_1$ is judged to be a binary 0,
$\hat{y}_0$ is updated using the exponential smoothing equation,

\[ \hat{y}_1 = A\hat{y}_0 + (1 - A)x_1 \tag{6-2} \]

and the revised estimate of the threshold, $\hat{\beta}_1$, is

\[ \hat{\beta}_1 = \frac{\hat{y}_1 + \hat{x}_0}{2} \tag{6-3} \]

If, however, $x_1$ is judged to be a binary 1, $\hat{x}_0$ is updated
by use of

\[ \hat{x}_1 = A\hat{x}_0 + (1 - A)x_1 \tag{6-4} \]

and the revised estimate of the threshold is

\[ \hat{\beta}_1 = \frac{\hat{x}_1 + \hat{y}_0}{2} \tag{6-5} \]

As more and more samples are processed, the estimated threshold moves
toward the optimum location, which is the intersection of p(x|0) and
p(x|1), and eventually gets close enough to allow so many correct
decisions that the frame synchronization code may be located. When the
code is located, the "Learning With Teacher" scheme is employed.
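
The update sequence described above can be sketched in a few lines.
The function name and return convention are hypothetical; the sketch
assumes that a sample equal to the threshold is judged to be a binary 1
and that only the estimator selected by the decision is updated.

```python
def ddm_step(x_hat, y_hat, sample, A):
    """One Decision-Directed-Measurement update.

    x_hat estimates the mean for a binary 1, y_hat the mean for a binary 0.
    Returns (x_hat, y_hat, threshold, decision) after processing `sample`."""
    threshold = 0.5 * (x_hat + y_hat)               # (6-1)
    if sample >= threshold:                         # judged to be a binary 1
        x_hat = A * x_hat + (1.0 - A) * sample      # (6-4)
        decision = 1
    else:                                           # judged to be a binary 0
        y_hat = A * y_hat + (1.0 - A) * sample      # (6-2)
        decision = 0
    return x_hat, y_hat, 0.5 * (x_hat + y_hat), decision

# Example: initial guesses 15 and 10, first sample 20, A = 0.85.
print(ddm_step(15.0, 10.0, 20.0, 0.85))
```

In this example the sample lies above the initial threshold of 12.5,
so only the estimate for binary 1 is moved and the estimate for
binary 0 is left at its initial guess.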


Mathematical Analysis

Some questions which should be answered are:

1. Does the estimated threshold, in fact, move toward the
optimum location?

2. What factors affect the convergence rate of the threshold?

3. What is the optimum point for the initial guess of the threshold
location?

To answer these questions, it is helpful to examine some probability
distributions. Figure 4 shows the probability density of the received
signal, the initial guesses of $\hat{x}_0$ and $\hat{y}_0$, the
corresponding $\hat{\beta}_0$ and the actual means, $A_0$ and $A_1$,
given by

\[ A_0 = E\{x_i \mid 0\} \tag{6-6} \]

and

\[ A_1 = E\{x_i \mid 1\} \tag{6-7} \]

Since any $x_1 \ge \hat{\beta}_0$ is judged to be a binary 1, and any
$x_1 < \hat{\beta}_0$ is judged to be a binary 0, the conditional
probabilities are

\[
p(x_1 \mid x_1 \ge \hat{\beta}_0) =
\frac{p(0)\,p(x_1|0) + p(1)\,p(x_1|1)}
     {p(0)\int_{\hat{\beta}_0}^{\infty} p(x_1|0)\,dx_1 + p(1)\int_{\hat{\beta}_0}^{\infty} p(x_1|1)\,dx_1}
\; u_{-1}(x_1 - \hat{\beta}_0)
\tag{6-8}
\]

and

\[
p(x_1 \mid x_1 < \hat{\beta}_0) =
\frac{p(0)\,p(x_1|0) + p(1)\,p(x_1|1)}
     {p(0)\int_{-\infty}^{\hat{\beta}_0} p(x_1|0)\,dx_1 + p(1)\int_{-\infty}^{\hat{\beta}_0} p(x_1|1)\,dx_1}
\; u_{-1}(\hat{\beta}_0 - x_1)
\tag{6-9}
\]

where $u_{-1}(z)$ is a unit step ($u_{-1} = 1$ for $z > 0$ and
$u_{-1} = 0$ for $z < 0$).

Figure 5 is a plot of $p(x_1 \mid x_1 \ge \hat{\beta}_0)$ and
$p(x_1 \mid x_1 < \hat{\beta}_0)$.

For $\hat{x}_1$ given by (6-4), the calculation of its probability
density requires the conditional probability for $x_1$ given by (6-8).
Since $\hat{x}_0$ and $x_1$ are independent, the probability density of
$\hat{x}_1$ given that $x_1 \ge \hat{\beta}_0$ is a weighted
convolution of $p(\hat{x}_0)$ and $p(x_1 \mid x_1 \ge \hat{\beta}_0)$
(Ref. 11, p. 189). Similar techniques are used to determine
$p(\hat{y}_1)$ with the condition of $x_1 < \hat{\beta}_0$. The shapes
of $p(\hat{x}_1 \mid x_1 \ge \hat{\beta}_0)$ and
$p(\hat{y}_1 \mid x_1 < \hat{\beta}_0)$ are shown in Figure 6. The new
threshold is

\[ \hat{\beta}_1 = \frac{A\hat{x}_0 + (1 - A)x_1 + \hat{y}_0}{2} \quad \text{for } x_1 \ge \hat{\beta}_0 \tag{6-10} \]

\[ \hat{\beta}_1 = \frac{\hat{x}_0 + A\hat{y}_0 + (1 - A)x_1}{2} \quad \text{for } x_1 < \hat{\beta}_0 \tag{6-11} \]

This includes the possibility that $\hat{x}_0$ is not updated if the
signal is decided to be a binary 0 and that $\hat{y}_0$ is not updated
if the signal is decided to be a binary 1. The probability density of
$\hat{\beta}_1$ is obtained by a weighted convolution of
$p(\hat{x}_0)$, $p(x_1)$, and $p(\hat{y}_0)$ with appropriate use of
the probability that $x_1 \ge \hat{\beta}_0$ and the probability that
$x_1 < \hat{\beta}_0$. The process is repeated when the second sample,
$x_2$, is received with the added complication that $\hat{\beta}_1$ has
a probability density. If both $x_1$ and $x_2$ are decided to be binary
0's, the calculation of $\hat{\beta}_2$ must still use $\hat{x}_0$ as
the estimate of $A_1$.

It can be seen that a mathematical analysis of this problem in
closed form is virtually impossible since it involves repeated
convolutions of truncated probability distributions. No general
analysis of a DDM system has been located in the literature. With the
exception of a "hang-up" for the case of no noise, there have been no
cases postulated or discovered experimentally in this investigation or
in the literature for binary signals for which the DDM system does not
converge. The possibility of the "hang-up" is discussed and eliminated
later in this report. The DDM system is designed so that the estimate
of the mean of the binary 1 moves to the mean of all signals above the
threshold and the estimate of the mean of the binary 0 moves to the mean
of all signals below the threshold. The threshold divides the range of
received signals into two portions so that the average of the estimates
of the two means is equal to the boundary between the portions.

Figure 6.- Probability density of estimated means after processing
of first sample.

Computer Analysis 

A general purpose digital computer is found to be useful for the
investigation of some of the characteristics of the system. Since it
is intended that digital techniques be used in the final hardware, the
range of inputs to the decision element is converted from analog to
digital format. Therefore, the range of inputs is separated, in effect,
into $2^n$ discrete partitions where n is the number of bits used in
the digital word. By making use of this partitioning of the input
range, it is possible to perform a numerical convolution of the
probability densities on a digital computer.

For the calculations here, the input range is divided into
64 levels (n = 6). This value is selected as providing about the
minimum resolution required and as being small enough to reduce the
required computer time. The probability densities p(0), p(1), p(x|0),
and p(x|1) are selected for the particular test case, calculated for
each of the 64 partitions, and inserted as inputs to the computer. The
initial estimates, $\hat{x}_0$ and $\hat{y}_0$, are also selected and
inserted as inputs. The computer uses $\hat{x}_0$ and $\hat{y}_0$ to
set up a 64 x 64 array representing $p(\hat{x}_0, \hat{y}_0)$. The
indices of the array represent the amplitude of the variables and the
value stored in a given location represents the probability. The
computer then uses each possibility of input with each possibility of
threshold to generate $p(\hat{x}_1, \hat{y}_1)$. The process is
repeated in order to determine $p(\hat{x}_k, \hat{y}_k)$. The average
value of the threshold (as a function of number of samples) is
calculated and is used as an indicator of the convergence of the
threshold. Appendix VI shows the computer program used with some
typical numbers as inputs.
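
The average threshold as a function of the number of samples can also
be estimated by simulation. The sketch below is not the numerical
convolution of Appendix VI; it substitutes a Monte Carlo average over
many simulated runs, which is simpler to write. It assumes Gaussian
noise, equally probable symbols, and unquantized samples, and the
parameter values and names are chosen only for illustration.

```python
import random

def average_threshold(x0, y0, A, mean0, mean1, sigma, n_samples, n_runs, seed=1):
    """Average DDM threshold after each sample, estimated over n_runs
    independent simulated runs with equally probable binary symbols."""
    rng = random.Random(seed)
    totals = [0.0] * (n_samples + 1)
    for _ in range(n_runs):
        x_hat, y_hat = x0, y0
        totals[0] += 0.5 * (x_hat + y_hat)
        for k in range(1, n_samples + 1):
            mean = mean1 if rng.random() < 0.5 else mean0
            sample = rng.gauss(mean, sigma)
            if sample >= 0.5 * (x_hat + y_hat):     # judged to be a binary 1
                x_hat = A * x_hat + (1.0 - A) * sample
            else:                                   # judged to be a binary 0
                y_hat = A * y_hat + (1.0 - A) * sample
            totals[k] += 0.5 * (x_hat + y_hat)
    return [t / n_runs for t in totals]

curve = average_threshold(x0=1.0, y0=64.0, A=0.85, mean0=20.0, mean1=50.0,
                          sigma=6.0, n_samples=100, n_runs=2000)
print(curve[0], curve[10], curve[-1])   # initial, intermediate, and final averages
```

With these assumed inputs the optimum threshold is 35, midway between
the two means, and the averaged curve can be compared against it.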

Effect of Choice of $\hat{x}_0$ and $\hat{y}_0$ on Convergence Rate

The first characteristic of the system to be investigated is the
effect of the initial guesses, $\hat{x}_0$ and $\hat{y}_0$, on the
convergence rate. Since $\hat{\beta}_0 = (\hat{x}_0 + \hat{y}_0)/2$,
there are many choices of $\hat{x}_0$ and $\hat{y}_0$ which yield the
same $\hat{\beta}_0$. For this test all of the inputs to the computer
program except $\hat{x}_0$ and $\hat{y}_0$ remain unchanged;
$\hat{x}_0$ and $\hat{y}_0$ are varied with the proper relationship so
that $\hat{\beta}_0$ remains constant. Figure 7 shows a plot of the
average value of the estimated threshold location versus the number of
samples processed as a function of $\hat{x}_0$ and $\hat{y}_0$.
Several other cases have been run on the computer but have not been
shown here since the results of Figure 7 are typical of those in the
other cases. It can be observed in Figure 7 that the fastest convergence
is obtained when $\hat{x}_0$ is as small as possible and $\hat{y}_0$
is as large as possible. This seems strange since $\hat{x}_0$ is the
estimate of the mean of the received signal when a binary 1 is
transmitted. However, this has been found to be true in all of the
cases which have been investigated.

The following explanation is offered for this property. For
$x_1 \ge \hat{\beta}_0$, only $\hat{x}_0$ is updated. From (6-1),

\[ \hat{y}_0 = 2\hat{\beta}_0 - \hat{x}_0 \tag{6-12} \]

Using this with (6-4) yields

\[
\hat{\beta}_1 = \frac{\hat{x}_1 + \hat{y}_0}{2}
 = \frac{A\hat{x}_0 + (1 - A)x_1 + 2\hat{\beta}_0 - \hat{x}_0}{2}
 = \hat{\beta}_0 + \frac{(1 - A)(x_1 - \hat{x}_0)}{2}
\tag{6-13}
\]

The change, $\Delta\beta$, in the threshold is

\[ \Delta\beta = \hat{\beta}_1 - \hat{\beta}_0 = \frac{(1 - A)(x_1 - \hat{x}_0)}{2} \tag{6-14} \]

For $x_1 < \hat{\beta}_0$, only $\hat{y}_0$ is updated. From (6-1),

\[ \hat{x}_0 = 2\hat{\beta}_0 - \hat{y}_0 \tag{6-15} \]

Using this with (6-2) yields

\[
\hat{\beta}_1 = \frac{\hat{x}_0 + \hat{y}_1}{2}
 = \frac{2\hat{\beta}_0 - \hat{y}_0 + A\hat{y}_0 + (1 - A)x_1}{2}
 = \hat{\beta}_0 + \frac{(1 - A)(x_1 - \hat{y}_0)}{2}
\tag{6-16}
\]

The change, $\Delta\beta$, in the threshold is

\[ \Delta\beta = \hat{\beta}_1 - \hat{\beta}_0 = \frac{(1 - A)(x_1 - \hat{y}_0)}{2} \tag{6-17} \]

If $\hat{\beta}_0$ is less than the optimum location of the threshold,
$\Delta\beta$ should be as large a positive value as possible.
According to (6-14) and (6-17), $\hat{x}_0$ and $\hat{y}_0$ should both
be as small as possible. Since
$\hat{\beta}_0 = (\hat{x}_0 + \hat{y}_0)/2$, one must be large if the
other is small. Since the received signal is decided to be a binary 1
with greater probability than to be a binary 0, (6-14) is applicable
more often than (6-17) and $\hat{x}_0$ is selected as small as possible
without regard to $\hat{y}_0$.

Conversely, if $\hat{\beta}_0$ is greater than the optimum location of
the threshold, $\Delta\beta$ should be as large a negative value as
possible. Therefore, $\hat{x}_0$ and $\hat{y}_0$ should be as large as
possible. Since the received signal is judged to be a binary 0 with
higher probability, (6-17) applies more often. Therefore, $\hat{y}_0$
has more effect and is selected as large as possible. This reasoning
gives the same results as had been observed from the computer runs;
that is, $\hat{x}_0$ should be small and $\hat{y}_0$ should be large.
For the case of no prior knowledge of the signal characteristics the
optimum location of $\hat{\beta}_0$ appears to be the center of the
input range. This means that $\hat{x}_0$ is the smallest possible
input value and $\hat{y}_0$ is the largest possible input value.
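
A small numerical example may make the effect of the initial guesses
concrete. The sketch below evaluates the threshold change of (6-14) and
(6-17) for two pairs of initial guesses that share the same initial
threshold. The threshold of 32.5, the constant A = 0.85, and the sample
values are assumptions chosen for illustration.

```python
def threshold_change(beta0, x_hat0, sample, A):
    """Change in the threshold after one sample, from (6-14) and (6-17).

    y_hat0 follows from (6-1) once beta0 and x_hat0 are fixed."""
    y_hat0 = 2.0 * beta0 - x_hat0
    if sample >= beta0:                                  # x_hat0 is updated
        return 0.5 * (1.0 - A) * (sample - x_hat0)       # (6-14)
    return 0.5 * (1.0 - A) * (sample - y_hat0)           # (6-17)

beta0, A = 32.5, 0.85
for x_hat0 in (1.0, 30.0):        # y_hat0 is then 64 or 35
    up = threshold_change(beta0, x_hat0, 50.0, A)     # sample above the threshold
    down = threshold_change(beta0, x_hat0, 20.0, A)   # sample below the threshold
    print(f"x_hat0 = {x_hat0:5.1f}: change {up:+.3f} for 50, {down:+.3f} for 20")
```

The widely separated pair produces the larger step in either
direction, so the threshold can move quickly whichever way the optimum
lies.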

Effect of Other Parameters on Convergence Rate 

This section contains observations of the relationship of the
convergence rate to other variables, such as:

1. Separation of the mean, $A_1$, of binary 1, from the mean, $A_0$,
of binary 0.

2. The variance of the noise; that is, the variance of p(x|0)
and p(x|1).

3. Separation of initial estimate of mean and actual mean.

A method is devised for measuring the convergence rate. Figure 7 shows
that, for optimum location of $\hat{x}_0$ and $\hat{y}_0$, the movement
of the average value of the estimated threshold has the appearance of
the exponential charging of a capacitor. Although the curve for this
system is not exactly an exponential, the number of samples required to
move 63.2 per cent of the distance between $\hat{\beta}_0$ and the
actual threshold is called a time constant and is used to compare
different systems.
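
The measurement just described can be sketched as a small routine that
finds where a sampled curve has covered a fraction 1 - 1/e of the
distance to its final value. Linear interpolation between samples is an
assumption made here, as are the function name and the test curve.

```python
import math

def measured_time_constant(curve, final_value):
    """Number of samples (linearly interpolated) needed for `curve` to cover
    1 - 1/e of the distance from curve[0] to final_value."""
    if final_value == curve[0]:
        return 0.0
    target = curve[0] + (1.0 - math.exp(-1.0)) * (final_value - curve[0])
    rising = final_value > curve[0]
    for k in range(1, len(curve)):
        crossed = curve[k] >= target if rising else curve[k] <= target
        if crossed:
            step = curve[k] - curve[k - 1]
            return (k - 1) + (target - curve[k - 1]) / step
    return None  # the curve never reaches the target

# Check on an exact exponential with A = 0.85, whose time constant is -1/ln(0.85).
A = 0.85
exponential = [1.0 - A ** k for k in range(60)]
tc = measured_time_constant(exponential, 1.0)
print(tc, -1.0 / math.log(A))
```

The interpolated value differs slightly from the analytic time constant
because the curve is sampled only at integer sample counts.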

Figure 8 shows a plot of the time constant versus the deviation
of the noise for three different values of $A_0$ and $A_1$. The
estimation factor, A, the initial estimate, $\hat{\beta}_0$, of the
threshold and the optimum location of the threshold are held constant
for this test. The initial estimates, $\hat{x}_0$ and $\hat{y}_0$, are
located at the points found to be optimum ($\hat{x}_0$ at the smallest
possible input value and $\hat{y}_0$ at the largest possible input
value) according to the discussion following equation (6-17).

Figure 8.- Time constant of mean of estimated threshold

It had been hoped that the convergence rate could be expressed
as a function of the estimation constant, A, and of a signal-to-noise
ratio defined as

\[ \mathrm{SNR} = \frac{A_1 - A_0}{\sigma_n} \tag{6-18} \]

where $\sigma_n$ is the deviation of p(x|0) and p(x|1). Figure 8 shows
that the convergence rate does not depend on A, $A_1$, $A_0$, and
$\sigma_n$ in such a simple way. For example, for $A_1 = 50$,
$A_0 = 20$, $\sigma_n = 6$, and A = 0.85,

\[ \mathrm{SNR} = \frac{50 - 20}{6} = 5 \]

with Figure 8 showing that

    Time constant = 10.52 samples

Also, for $A_1 = 40$, $A_0 = 30$, $\sigma_n = 2$,

\[ \mathrm{SNR} = \frac{40 - 30}{2} = 5 \tag{6-19} \]

Figure 8 shows that

    Time constant = 2.47 samples                              (6-20)

Thus, it can be seen that the convergence rate is different for two
cases which have the same signal-to-noise ratio. This investigation
is not carried any further due to the amount of computer time required
to generate each point on the curve and due to the large number of
curves which are required to determine the effect of $A_0$, $A_1$,
$\hat{\beta}_0$, and other variables on the convergence rate.


Comparison With Learning With Teacher 
The convergence rate for the Learning With Teacher scheme can be
compared to the convergence rate of the DDM technique for those cases
shown in Figure 8. For the case of Learning With Teacher for equally
probable signals, the threshold is computed by

\[ \hat{\beta}_i = \frac{\hat{x}_j + \hat{y}_k}{2} \tag{6-21} \]

Since $\hat{x}_j$ and $\hat{y}_k$ are estimated by recursive equations
with the same time constant, $\hat{\beta}_i$ will have undergone one
time constant when both $\hat{x}_j$ and $\hat{y}_k$ have undergone one
time constant. It takes twice as many samples for both $\hat{x}_j$ and
$\hat{y}_k$ to undergo one time constant so that the time constant
(measured in "number of samples") of $\hat{\beta}_i$ is twice that of
either $\hat{x}_j$ or $\hat{y}_k$. For the situation shown in Figure 8,
A is equal to 0.85. The time constant for either $\hat{x}_j$ or
$\hat{y}_k$ is

\[ n_s = \frac{-1}{\ln A} = \frac{-1}{\ln(0.85)} = 6.15 \text{ samples} \tag{6-22} \]

The time constant of $\hat{\beta}_i$, the threshold, is

\[ 2n_s = 12.3 \text{ samples} \tag{6-23} \]

By comparing this time constant to those in Figure 8, it can be seen
that the DDM technique converges faster than the Learning With Teacher
scheme for small separation of $A_0$ and $A_1$ and that the Learning
With Teacher scheme is faster for large separation of $A_0$ and $A_1$.
These observations deal only with the situation shown in Figure 8.
Curves similar to those in Figure 8 for other situations would be
necessary in order to make a general comparison between the Learning
With Teacher method and the DDM technique.
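
The two time constants quoted for the Learning With Teacher scheme
follow directly from A = 0.85, as the short calculation below shows.
The variable names are illustrative only.

```python
import math

A = 0.85
n_s = -1.0 / math.log(A)      # time constant of either mean estimate, in samples
teacher = 2.0 * n_s           # time constant of the threshold, in samples

print(f"n_s  = {n_s:.2f} samples")
print(f"2n_s = {teacher:.1f} samples")
```

The factor of two reflects that, with equally probable signals, each
mean estimate is updated by only about half of the samples.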

Performance of Noiseless System 

For certain selections of $\hat{x}_0$ and $\hat{y}_0$ and a noiseless system
(Ref. 3, p. 37), it is possible that the estimated threshold does not
converge to the proper value. As an example, consider the following
case:

$$p(x_i \mid 0) = 1.0\,\delta(x_i - 20)$$
$$p(x_i \mid 1) = 1.0\,\delta(x_i - 30)$$

with $\hat{x}_0 = 15$ and $\hat{y}_0 = 10$. This case is illustrated in Figure 9.
Because of the convergence of the estimation technique, $\hat{x}_k$ moves toward
the mean of the signals above the threshold. For the case in Figure 9
and for p(0) = p(1) = 0.5,

$$\lim_{k \to \infty} \hat{x}_k = 0.5(20) + 0.5(30) = 25 \tag{6-24}$$

There can never be any received signals falling below the threshold, so
that $\hat{y}_0$ is never updated. Therefore, the limiting value of the
threshold is

$$\lim_{k \to \infty} \hat{\beta}_k = \lim_{k \to \infty} \frac{\hat{x}_k + \hat{y}_k}{2} = \frac{1}{2}\left[\lim_{k \to \infty} \hat{x}_k + \lim_{k \to \infty} \hat{y}_k\right] = \frac{25 + 10}{2} = 17.5 \tag{6-25}$$


Therefore, the threshold has not converged to the proper value.

This possibility is eliminated if $\hat{x}_0$ and $\hat{y}_0$ are chosen as
previously discussed in this section ($\hat{x}_0$ as small as possible and $\hat{y}_0$
as large as possible). This ensures that $\hat{y}_0$ is always above the
final location of the threshold and that $\hat{x}_0$ is always below the
threshold location. In order to prevent the possibility of a "hang-up"
situation occurring due to a change in signal characteristics, it is
necessary to reset $\hat{x}_0$ to the smallest possible value and $\hat{y}_0$ to the
largest possible value each time the location of the synchronization
code is lost.
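
The "hang-up" can be reproduced with a short simulation. The sketch below assumes a simple form of the decision-directed update: a sample above the current threshold updates the upper estimate, any other sample updates the lower one, both by exponential smoothing with A = 0.85. The function name `run_ddm`, the alternating 20/30 input, and the initial values 0 and 100 in the second run are illustrative assumptions, not the program used in the study.

```python
def run_ddm(samples, x0, y0, A=0.85):
    """Decision-directed tracking of two means in a noiseless system.

    x_hat is updated by samples above the threshold, y_hat by the rest
    (an assumed form of the DDM rule).
    """
    x_hat, y_hat = x0, y0
    for s in samples:
        threshold = 0.5 * (x_hat + y_hat)
        if s > threshold:
            x_hat = A * x_hat + (1.0 - A) * s
        else:
            y_hat = A * y_hat + (1.0 - A) * s
    return x_hat, y_hat, 0.5 * (x_hat + y_hat)

# Noiseless, equally probable signals at 20 and 30.
samples = [20.0, 30.0] * 200

# Initial conditions of the example: the threshold hangs up below 20.
stuck = run_ddm(samples, x0=15.0, y0=10.0)

# x0 small and y0 large: the threshold settles midway between the signals.
good = run_ddm(samples, x0=0.0, y0=100.0)

print(stuck)
print(good)
```

In the first run the lower estimate never leaves its initial value of 10, so the threshold stays near 17.5. In the second run both estimates are exercised and the threshold approaches 25.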


CHAPTER VII 


CALCULATION OF THE PROBABILITY OF ERROR 


The probability of error can be determined for a system using
an estimated threshold in place of the optimum threshold. For the
case of equally probable signals (K = 1) the optimum location of the
threshold in the Bayes sense is

$$\beta = \frac{A_1 + A_0}{2}$$


If it is assumed that the adaptive detector has been moved sufficiently 
close to the optimum threshold by the Decision-Directed-Measurement 
technique that the Learning With Teacher scheme can be used and if it 
is assumed that the detector has processed a very large number of 
samples from a stationary environment, the asymptotic probability of 
error can be investigated. Since the estimates of the means are random 
functions, the threshold and the probability of error are random 
functions. The average probability of error can be obtained by using 
the probability of error as a function of the threshold location and 
the probability density of the location of the threshold. 

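The threshold is located by the adaptive system at

$$\hat{\beta}_\infty = \frac{\hat{x}_\infty + \hat{y}_\infty}{2}$$

As discussed in Chapter III (Estimation of the Mean), the variables
$\hat{x}_\infty$ and $\hat{y}_\infty$ both have a Gaussian probability density with

$$\text{variance of } \hat{x}_\infty = \text{variance of } \hat{y}_\infty = \left(\frac{1 - A}{1 + A}\right)\sigma_n^2$$

$$E\{\hat{x}_\infty\} = A_1$$

and

$$E\{\hat{y}_\infty\} = A_0$$

The two variables $\hat{x}_\infty$ and $\hat{y}_\infty$ are independent because the noise has
been assumed to be white and Gaussian. The threshold, $\hat{\beta}_\infty$, has a
Gaussian distribution with a mean of

$$E\{\hat{\beta}_\infty\} = \frac{A_1 + A_0}{2}$$

and

$$\text{variance of } \hat{\beta}_\infty = \frac{1}{2}\left(\frac{1 - A}{1 + A}\right)\sigma_n^2$$
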

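The probability of error as a function of the threshold location is

$$\text{prob. of error}(\hat{\beta}_\infty) = \frac{1}{2\sqrt{2\pi}\,\sigma_n}\left[\int_{\hat{\beta}_\infty}^{\infty} \exp\left(-\frac{(x - A_0)^2}{2\sigma_n^2}\right) dx + \int_{-\infty}^{\hat{\beta}_\infty} \exp\left(-\frac{(x - A_1)^2}{2\sigma_n^2}\right) dx\right]$$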
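
The expression above can be evaluated numerically with the complementary error function. The following sketch assumes $A_1 > A_0$ and equally probable signals; the names `q_function` and `prob_error` are illustrative and are not taken from the original computer program.

```python
import math

def q_function(z):
    """Upper-tail probability of a zero-mean, unit-variance Gaussian."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def prob_error(threshold, a0, a1, sigma_n):
    """Probability of error for equally probable signals and a given threshold."""
    zero_read_as_one = q_function((threshold - a0) / sigma_n)  # 0 sent, sample above threshold
    one_read_as_zero = q_function((a1 - threshold) / sigma_n)  # 1 sent, sample below threshold
    return 0.5 * (zero_read_as_one + one_read_as_zero)

# Threshold midway between the means with (A1 - A0)/sigma_n = 5.0.
p_opt = prob_error(threshold=2.5, a0=0.0, a1=5.0, sigma_n=1.0)
print(p_opt)
```

With the threshold at the midpoint this reduces to the optimum-detector case, so it can be compared against the values quoted for A = 1.0.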



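The probability density of the threshold is

$$p(\hat{\beta}_\infty) = \frac{1}{\sqrt{2\pi}\,\sigma_\beta} \exp\left(-\frac{\left(\hat{\beta}_\infty - \frac{A_1 + A_0}{2}\right)^2}{2\sigma_\beta^2}\right)$$

where $\sigma_\beta^2$ denotes the variance of $\hat{\beta}_\infty$.

The average value of the probability of error is

$$E\{\text{prob. of error}\} = \int_{-\infty}^{\infty} p(\hat{\beta}_\infty)\left[\text{prob. of error}(\hat{\beta}_\infty)\right] d\hat{\beta}_\infty$$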


The average probability of error has been calculated on a digital
computer for several values of A and for several values of SNR,
where

$$\text{SNR} = \frac{A_1 - A_0}{\sigma_n}$$

The following table shows the results of these calculations:


Average probability of error

SNR      A = 0.85         A = 0.90         A = 0.95         A = 1.0

3.33     9.25 x 10^-2     9.07 x 10^-2     8.7 x 10^-2      4.74 x 10^-2
4.0      4.53 x 10^-2     4.37 x 10^-2     4.1 x 10^-2      2.27 x 10^-2
5.0      1.28 x 10^-2     1.21 x 10^-2     1.1 x 10^-2      6.21 x 10^-3
6.67     9.56 x 10^-4     8.6 x 10^-4      7.5 x 10^-4      4.30 x 10^-4
10.0     7.93 x 10^-7     6.42 x 10^-7     5.0 x 10^-7      2.87 x 10^-7
20.0     7.8 x 10^-23     3.7 x 10^-23     1.69 x 10^-23    7.62 x 10^-24



Figure 10 shows a plot of the results for A = 0.85 and A = 1.0. The
results for A = 1.0 correspond to the optimum detector for this
situation. All other values of A cause the average probability of
error to be higher than for the case of A = 1.0. It should be noted
that Figure 10 gives the average probability of error. The actual
probability of error is a random function which has a minimum given
by the curve for A = 1.0. The value of A can be chosen as close to
1.0 as desired in order to reduce the probability of error, but
increasing A causes an increase in the reaction time of the
estimation equation as discussed in Chapter III.

Since Patrick and Costello (Ref. 1) only make their calculations
for an infinite number of samples used to compute the sample mean, the
only comparison between these results and those of Reference 1 that can
be made is for A = 1 with equally probable signals. At this point the
estimates have zero variance and the adaptive detector has the same
probability of error as the optimum detector. The additional error
found in Reference 1 is due to an unsymmetrical bias caused by
nonequally probable signals. No attempt was made by Patrick and Costello
(Ref. 1) to compensate for the effect of the nonequally probable
signals.

A calculation of the probability of error of the DDM system with
nonequally probable signals and with A not equal to 1.0 would show
the ability of the DDM system to properly locate the threshold. This
analysis would be more complicated than the analysis shown in
Reference 1 because values of A other than 1.0 have the effect of
using less than an infinite number of samples in the estimation process
and because an estimation of the variance is included in the system
proposed here to reduce the additional error due to nonequally probable
signals. This estimate of the variance is also a biased estimate when
operating in conjunction with the DDM technique for the same reason
(p. 15) that the estimates of the means were biased. Since the input to
the estimator of the variance is not Gaussian in the DDM mode, the
calculation of the variance of the estimated variance shown in Chapter V
does not apply. This calculation of the probability of error has not
been included as a part of this investigation.

An example can illustrate a situation in which the adaptive
detector would give a lower probability of error than a degraded
optimum detector. Assume that an optimum detector has been constructed
and is operating in an environment of zero-mean, white, Gaussian noise.
Let an enemy in the neighborhood of the transmitter begin to transmit
a continuous sequence of signals which correspond exactly to the
binary 0 signal and which are exactly in synchronization with the data
bits. Before the enemy began to transmit, the equation, (2-1),
implemented by the optimum detector was

$$u = W^T (S_1 - S_0) - b$$

The equation implemented by the optimum detector after the beginning
of transmission of the enemy is

$$u = W^T (S_1 - S_0) - b + A_c' S_0^T (S_1 - S_0)$$

where $A_c'$ is the channel gain of the channel from the enemy
transmitter to the receiver. Since $A_c' S_0^T (S_1 - S_0)$ is a constant and is
present for both binary 0 and binary 1 signals, the additional signal
$A_c' S_0^T (S_1 - S_0)$ has the appearance of a nonzero mean of the noise. This
would cause the optimum detector to be operating with its threshold
located at a nonoptimum location.

Figure 11 shows a plot of the probability of error of an optimum
detector as a function of the location of the threshold. The abscissa
is expressed in terms of the standard deviation of the noise. For
instance, if the signal-to-noise ratio were 5.0 and if the mean of the
noise were $0.1\,\sigma_n$, the degraded optimum detector would have probability
of error equal to 0.00643. If the mean of the noise is zero, the optimum
detector has a probability of error of 0.00621.

Figure 11.- Comparison of optimum and adaptive detectors
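
The two figures quoted for a signal-to-noise ratio of 5.0 can be reproduced directly. The sketch below assumes that the fixed threshold stays midway between the two signal means while the noise acquires a mean expressed in units of $\sigma_n$; the function names are illustrative only.

```python
import math

def q_function(z):
    """Upper-tail probability of a zero-mean, unit-variance Gaussian."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def degraded_error(snr, noise_mean_in_sigmas):
    """Error probability of a fixed-threshold detector when the noise mean shifts.

    snr is (A1 - A0)/sigma_n; the threshold is assumed to remain at the
    midpoint that was optimum for zero-mean noise.
    """
    half = snr / 2.0
    m = noise_mean_in_sigmas
    return 0.5 * (q_function(half - m) + q_function(half + m))

p_zero_mean = degraded_error(5.0, 0.0)   # optimum detector, zero-mean noise
p_shifted = degraded_error(5.0, 0.1)     # noise mean of 0.1 sigma_n
print(p_zero_mean, p_shifted)
```

Sweeping `noise_mean_in_sigmas` over a range of values traces out a curve of the kind plotted for the degraded optimum detector.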

An adaptive detector operating in this same situation has an
average probability of error which does not change as the mean of the
noise changes if sufficient time is allowed for any transients to
disappear. The probability of error of the adaptive detector is higher
than that of the optimum detector when the optimum detector is using
the optimum threshold. As the mean of the noise increases, the
probability of error of the degraded optimum detector increases while
the probability of error of the adaptive system remains constant.
Figure 11 shows the points where the probabilities of error of the two
systems are equal for a given signal-to-noise ratio and for a given
recursive constant. For a signal-to-noise ratio of 5.0, the adaptive
system using A = 0.85 has a lower probability of error than a
degraded optimum system for values of $A_c' S_0^T (S_1 - S_0)$ greater than
approximately $0.55\,\sigma_n$. For values of A greater than 0.85, the point at
which the two systems have equal probability of error is decreased. As
the signal-to-noise ratio increases, the point at which the two systems
have equal probability of error also decreases. This fact can be shown
from curves similar to Figure 11 for other signal-to-noise ratios, but
these have not been included here.



CHAPTER VIII 


CONCLUSIONS AND FUTURE WORK 


The results of this investigation show that the system proposed 
here is capable of acting as an adaptive detector. The system requires 
an estimate of the mean and an estimate of the variance of a sequence 
of random numbers. The mean is estimated by

$$\hat{x}_k = A\hat{x}_{k-1} + (1 - A)x_k$$

and the variance is estimated by

$$\hat{\sigma}_k^2 = B\hat{\sigma}_{k-1}^2 + \frac{(1 + A)(1 - B)}{2}\left(x_k - \hat{x}_{k-1}\right)^2$$



If the input data have a Gaussian distribution with mean of "a" and
variance of "$\sigma^2$", the estimated variance has a variance of

$$\lim_{k \to \infty} \text{variance of } \hat{\sigma}_k^2 = \left(\frac{1 - B}{1 + B}\right)\left[2 + \frac{B(1 - A)^2}{1 - A^2 B}\right]\sigma^4$$


The accuracy and time response of each estimator can be varied by the
choice of constants A and B. The frame synchronization code is used
as the teacher in a "Learning With Teacher" technique. A Decision
Directed Measurement technique is used when the frame synchronization
code is not available.
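
The two recursive estimators can be written compactly in code. The following is a minimal sketch, assuming the mean recursion $\hat{x}_k = A\hat{x}_{k-1} + (1-A)x_k$ and a variance recursion driven by $(x_k - \hat{x}_{k-1})^2$ with gain $(1+A)(1-B)/2$; the function name `smooth`, the chosen constants, and the test data are illustrative assumptions.

```python
import random

def smooth(samples, A, B, x0=0.0, var0=0.0):
    """Run the recursive mean and variance estimators over a sequence.

    Returns a list of (mean estimate, variance estimate) pairs, one per sample.
    """
    x_hat, var_hat = x0, var0
    gain = (1.0 + A) * (1.0 - B) / 2.0
    history = []
    for x in samples:
        # The variance update uses the previous mean estimate.
        var_hat = B * var_hat + gain * (x - x_hat) ** 2
        x_hat = A * x_hat + (1.0 - A) * x
        history.append((x_hat, var_hat))
    return history

# Gaussian test data with mean 3 and variance 4 (illustrative values).
rng = random.Random(1)
data = [rng.gauss(3.0, 2.0) for _ in range(200000)]
hist = smooth(data, A=0.85, B=0.95)

avg_mean = sum(m for m, _ in hist) / len(hist)
avg_var = sum(v for _, v in hist) / len(hist)
print(avg_mean, avg_var)
```

Averaged over a long stationary record, the two estimates should sit close to the true mean and variance. Individual estimates fluctuate about those values by an amount that depends on A and B.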

It has been found that in the recursive estimate of the mean,
the more accurate the estimate, the slower the convergence. The
optimum location of $\hat{x}_0$ and $\hat{y}_0$ when using the Decision Directed
Measurement technique is found by a computer study to be $\hat{x}_0$ as small
as possible and $\hat{y}_0$ as large as possible. This selection of $\hat{x}_0$ and
$\hat{y}_0$ is found to prevent the possibility of a "hang-up" when there is no
noise. The asymptotic, average probability of error for the adaptive
system using only the estimates of the means in the Learning With
Teacher mode is found to be equal to that of the optimum detector for
A = 1.0 and greater than the optimum detector for A < 1.0. However,
a change in the environment can cause the adaptive detector to have a
lower probability of error than an optimum detector, which cannot track
the changes.

This investigation has by no means completely analyzed the
proposed system. Some of the areas which offer possibilities for future
work are discussed here. One of the most important areas for future
work is the construction of hardware to perform the functions discussed
in this dissertation. R. G. Brown (Ref. 4) mentions other techniques of
estimation of the mean which are essentially higher orders of the
technique used here. It would be interesting to attempt to use some
of the other techniques and to compare their results to those of the
exponential smoothing method. The time response of the estimation of
the variance is unknown and should be determined, but it will be
difficult to determine due to the nonlinearity of the estimation
technique. The probability of error of the adaptive system in the
Learning With Teacher mode for $K \neq 1$ also should be determined. The
operation of the "Decision-Directed-Measurement" technique also has
some areas which require more investigation. The effect of all the
system variables on the convergence rate of the system needs to be
determined. All of the techniques discussed here need to be
investigated when operating in an environment of correlated noise and
correlation between adjacent data samples. The probability of error for
this adaptive system in the DDM mode should be calculated. The operation
of the DDM mode when the variance is estimated also needs to be
investigated.

APPENDIX I


METHOD OF ESTIMATION OF $\sigma_n^2 / A_c$


The purpose of this appendix is to show that

$$\frac{\sigma_n^2}{A_c} = \frac{\text{variance of } \left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}}{E\left\{(A_c S_1 + N)^T (S_1 - S_0)\right\} - E\left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}} \tag{A-1-1}$$

The adaptive detector computes estimates of three properties of its
input. These are

$$E\{x \mid W = A_c S_1 + N\} = E\left\{(A_c S_1 + N)^T (S_1 - S_0)\right\}$$

$$E\{x \mid W = A_c S_0 + N\} = E\left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}$$

and

$$\text{variance of } \{x \mid W = A_c S_0 + N\} = \text{variance of } \left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}$$

The remaining portion of this appendix shows that the numerator of
Equation (A-1-1) is

$$\text{variance of } \left\{(A_c S_0 + N)^T (S_1 - S_0)\right\} = \sigma_n^2 (S_1 - S_0)^T (S_1 - S_0)$$

and that the denominator is

$$E\left\{(A_c S_1 + N)^T (S_1 - S_0)\right\} - E\left\{(A_c S_0 + N)^T (S_1 - S_0)\right\} = A_c (S_1 - S_0)^T (S_1 - S_0)$$

The numerator is rearranged to yield

$$\text{variance of } \left\{(A_c S_0 + N)^T (S_1 - S_0)\right\} = E\Big\{\left[(A_c S_0 + N)^T (S_1 - S_0) - E\left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}\right]^T \left[(A_c S_0 + N)^T (S_1 - S_0) - E\left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}\right]\Big\}$$

Equation (2-7b) with e = 0 is used to reduce the complexity of the
above equation, so that

$$\text{variance of } \left\{(A_c S_0 + N)^T (S_1 - S_0)\right\} = E\left\{\left[(N - \bar{N})^T (S_1 - S_0)\right]^T \left[(N - \bar{N})^T (S_1 - S_0)\right]\right\}$$

$$= E\left\{(N - \bar{N})^T (N - \bar{N})\right\} (S_1 - S_0)^T (S_1 - S_0)$$

$$= \sigma_n^2 (S_1 - S_0)^T (S_1 - S_0) \tag{A-1-2}$$

Identical results are obtained if variance of $\left\{(A_c S_1 + N)^T (S_1 - S_0)\right\}$
is computed.

The denominator of Equation (A-1-1) is computed using (2-7a) and
(2-7b) with e = 0, which yields

$$E\left\{(A_c S_1 + N)^T (S_1 - S_0)\right\} - E\left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}$$

$$= A_c S_1^T S_1 - A_c S_1^T S_0 + \bar{N}^T (S_1 - S_0) - A_c S_0^T S_1 + A_c S_0^T S_0 - \bar{N}^T (S_1 - S_0)$$

$$= A_c \left(S_1^T S_1 - S_1^T S_0 - S_0^T S_1 + S_0^T S_0\right)$$

$$= A_c (S_1 - S_0)^T (S_1 - S_0) \tag{A-1-3}$$

Equation (A-1-1) results from the division of Equation (A-1-2) by
Equation (A-1-3) as shown by

$$\frac{\text{variance of } \left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}}{E\left\{(A_c S_1 + N)^T (S_1 - S_0)\right\} - E\left\{(A_c S_0 + N)^T (S_1 - S_0)\right\}} = \frac{\sigma_n^2 (S_1 - S_0)^T (S_1 - S_0)}{A_c (S_1 - S_0)^T (S_1 - S_0)} = \frac{\sigma_n^2}{A_c}$$
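
The ratio in Equation (A-1-1) can be checked by simulation. The sketch below assumes zero-mean white Gaussian noise added to each component of the signal vector; the four-component signal vectors, the values of `a_c` and `sigma_n`, and the function name `correlator_output` are all illustrative assumptions.

```python
import random
import statistics

def correlator_output(signal, a_c, sigma_n, diff, rng):
    """One sample of (A_c*S + N)^T (S1 - S0) with white Gaussian noise N."""
    return sum((a_c * s + rng.gauss(0.0, sigma_n)) * d
               for s, d in zip(signal, diff))

rng = random.Random(0)
s0 = [1.0, -1.0, 1.0, -1.0]      # illustrative binary 0 waveform samples
s1 = [1.0, 1.0, -1.0, -1.0]      # illustrative binary 1 waveform samples
diff = [b - a for a, b in zip(s0, s1)]
a_c, sigma_n = 2.0, 0.5

x_given_0 = [correlator_output(s0, a_c, sigma_n, diff, rng) for _ in range(20000)]
x_given_1 = [correlator_output(s1, a_c, sigma_n, diff, rng) for _ in range(20000)]

# Numerator: variance of the output when a 0 is sent.
# Denominator: difference of the two conditional means.
ratio = statistics.pvariance(x_given_0) / (
    statistics.fmean(x_given_1) - statistics.fmean(x_given_0))

print(ratio, sigma_n ** 2 / a_c)
```

The two printed numbers should agree to within the sampling error of the simulation.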



APPENDIX II 


ANALYSIS OF THE EXPONENTIAL SMOOTHING TECHNIQUE 

This appendix is a derivation of some of the properties of the
estimation of the mean by the exponential smoothing technique. The
results shown here were published in Reference 4 by R. G. Brown;
however, some of the results here are derived in a different manner.

$$\hat{x}_k = A\hat{x}_{k-1} + (1 - A)x_k; \quad k = 1, 2, 3, \ldots \tag{A-2-1}$$

where

$\hat{x}_k$ = kth estimate of the mean
$x_k$ = kth data sample
A = recursive constant, $0 \le A < 1.0$
$\hat{x}_0$ = initial guess of mean

This equation can be compared with that in Reference 4 if (1 - A) is
set equal to $\alpha$.


Derivation of the Mean of the Estimation 
The general term of the estimation equation is rearranged by
inserting an expression for $\hat{x}_{k-1}$ into the expression for $\hat{x}_k$,
inserting an expression for $\hat{x}_{k-2}$, and continuing until $\hat{x}_0$ is reached.
The resulting expression is

$$\hat{x}_k = A^k \hat{x}_0 + (1 - A)\left[x_k + A x_{k-1} + A^2 x_{k-2} + \cdots + A^{k-1} x_1\right] \tag{A-2-2}$$

The mean value of $\hat{x}_k$ is

$$E\{\hat{x}_k\} = E\{A^k \hat{x}_0\} + (1 - A)\left[E\{x_k\} + A E\{x_{k-1}\} + \cdots + A^{k-1} E\{x_1\}\right]$$

The random function from which the samples, $x_i$, are taken is assumed
to be stationary with a mean of "a" and a variance of "$\sigma^2$" so that

$$E\{x_i\} = a$$

The mean of $\hat{x}_k$ can be rewritten to yield

$$E\{\hat{x}_k\} = E\{A^k \hat{x}_0\} + (1 - A)E\{x_i\}\left[1 + A + \cdots + A^{k-1}\right]$$

$$= A^k \hat{x}_0 + (1 - A)a\left[1 + A + A^2 + \cdots + A^{k-1}\right] \tag{A-2-3}$$

Since A < 1.0,

$$\lim_{k \to \infty} A^k \hat{x}_0 = 0$$

and

$$\lim_{k \to \infty} \left(1 + A + A^2 + \cdots + A^{k-1}\right) = \frac{1}{1 - A}$$

Therefore,

$$\lim_{k \to \infty} E\{\hat{x}_k\} = a(1 - A)\left(\frac{1}{1 - A}\right) = a \tag{A-2-4}$$
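
The equivalence of the recursion (A-2-1) and the expanded form (A-2-2), and the limit (A-2-4), can be confirmed numerically. This is a sketch only; the function names and the sample values are illustrative.

```python
def smooth_recursive(samples, A, x0):
    """Apply x_hat[k] = A*x_hat[k-1] + (1-A)*x[k] over the whole sequence."""
    x_hat = x0
    for x in samples:
        x_hat = A * x_hat + (1.0 - A) * x
    return x_hat

def smooth_unrolled(samples, A, x0):
    """Expanded form: A^k*x0 + (1-A)*[x_k + A*x_{k-1} + ... + A^(k-1)*x_1]."""
    k = len(samples)
    # samples[i] holds x_{i+1}, so the newest sample gets weight A^0.
    weighted = sum(A ** (k - 1 - i) * x for i, x in enumerate(samples))
    return A ** k * x0 + (1.0 - A) * weighted

def expected_estimate(k, A, x0, a):
    """Mean of the kth estimate when every sample has mean a."""
    return A ** k * x0 + (1.0 - A) * a * sum(A ** i for i in range(k))

samples = [2.0, 5.0, 3.5, 4.0, 1.0, 6.0]
rec = smooth_recursive(samples, 0.85, 15.0)
unr = smooth_unrolled(samples, 0.85, 15.0)
print(rec, unr, expected_estimate(200, 0.85, 15.0, 3.0))
```

For large k the influence of the initial guess dies out as $A^k$, leaving the mean of the input.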
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Derivation of the Variance of the Estimation 
The variance of $\hat{x}_k$ is calculated by using

$$\text{variance of } \hat{x}_k = E\{\hat{x}_k^2\} - E^2\{\hat{x}_k\} \tag{A-2-5}$$

The first two terms are

$$\text{variance of } \hat{x}_0 = E\{\hat{x}_0^2\} - E^2\{\hat{x}_0\} = \hat{x}_0^2 - \hat{x}_0^2 = 0 \tag{A-2-6}$$

and

$$\text{variance of } \hat{x}_1 = E\left\{A^2 \hat{x}_0^2 + 2A(1 - A)\hat{x}_0 x_1 + (1 - A)^2 x_1^2\right\} - \left[E\{A\hat{x}_0\} + E\{(1 - A)x_1\}\right]^2$$

$$= A^2 \hat{x}_0^2 + 2A(1 - A)a\hat{x}_0 + (1 - A)^2 (a^2 + \sigma^2) - A^2 \hat{x}_0^2 - 2A(1 - A)a\hat{x}_0 - (1 - A)^2 a^2$$

$$= (1 - A)^2 \sigma^2 \tag{A-2-7}$$

The variance for several values of k has been calculated by the same
technique and is presented in the following table:

The general expression of the variance of $\hat{x}_k$ is written from the
table by inspection to yield

$$\text{variance of } \hat{x}_k = (1 - A)^2 \sigma^2 \sum_{i=0}^{k-1} \left[A^2\right]^i \tag{A-2-8}$$

As k approaches infinity,

$$\lim_{k \to \infty} \text{variance of } \hat{x}_k = (1 - A)^2 \sigma^2 \left(\frac{1}{1 - A^2}\right)$$

$$\lim_{k \to \infty} \text{variance of } \hat{x}_k = \left(\frac{1 - A}{1 + A}\right)\sigma^2 \tag{A-2-9}$$
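
Equations (A-2-8) and (A-2-9) are easy to tabulate. The sketch below evaluates the finite sum for a few values of k and compares it with the limiting value; the function names and the choice $\sigma^2 = 4$ are illustrative.

```python
def variance_of_estimate(k, A, sigma2):
    """Variance of the kth estimate: (1-A)^2 * sigma^2 * sum_{i=0}^{k-1} (A^2)^i."""
    return (1.0 - A) ** 2 * sigma2 * sum((A * A) ** i for i in range(k))

def limiting_variance(A, sigma2):
    """Limit of the variance as k grows without bound."""
    return (1.0 - A) / (1.0 + A) * sigma2

A, sigma2 = 0.85, 4.0
for k in (0, 1, 2, 5, 20, 500):
    print(k, variance_of_estimate(k, A, sigma2))
print("limit", limiting_variance(A, sigma2))
```

The variance starts at zero, because the initial guess is a fixed number, and rises toward the limit as more noisy samples enter the estimate.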


Derivation of the Time Constant of the Estimation 
Since the estimation equation must also react to step changes in
the mean of the incoming data, it is desirable to determine the time
required to respond to a step change. The estimation equation is
analyzed as if it were a filter by the use of the z-transform method
(Ref. 12). The impulse response of the following equation is found:

$$\hat{x}_k = A\hat{x}_{k-1} + (1 - A)x_k$$

Let

$$\hat{x}_{-1} = 0$$
$$x_0 = 1$$
$$x_i = 0 \text{ for } i \neq 0$$

This set of conditions determines the response of the estimation
technique to an input of a unit impulse at t = 0. From the definition
of the z-transform (Ref. 12, p. 145)

$$X(z) = \sum_{k=0}^{\infty} \hat{x}_k z^{-k}$$

$$X(z) = (1 - A)z^0 + A(1 - A)z^{-1} + A^2 (1 - A)z^{-2} + \cdots$$

$$X(z) = (1 - A)\left(1 + Az^{-1} + A^2 z^{-2} + \cdots\right)$$

$$X(z) = (1 - A)\frac{1}{1 - Az^{-1}}$$

$$X(z) = \frac{(1 - A)z}{z - A} \tag{A-2-10}$$

The Laplace transform which corresponds to the z-transform given above
is

$$H(s) = \frac{1 - A}{s - \frac{1}{T}\ln A} \tag{A-2-11}$$
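
The series used to form X(z) can be checked by driving the estimation equation with a unit impulse. The sketch below compares the simulated response with $(1 - A)A^k$ and with an exponential whose time constant is $-1/\ln A$ samples; the function name `impulse_response` is illustrative.

```python
import math

def impulse_response(A, n):
    """Response of x_hat[k] = A*x_hat[k-1] + (1-A)*x[k] to a unit impulse at k = 0."""
    x_hat, out = 0.0, []
    for k in range(n):
        x = 1.0 if k == 0 else 0.0
        x_hat = A * x_hat + (1.0 - A) * x
        out.append(x_hat)
    return out

A = 0.85
h = impulse_response(A, 30)
n_s = -1.0 / math.log(A)   # time constant in samples

# Each term should equal (1-A)*A^k, i.e. (1-A)*exp(-k/n_s).
for k in (0, 1, 6, 12):
    print(k, h[k], (1.0 - A) * math.exp(-k / n_s))
```

After about six samples the response has fallen to roughly 1/e of its initial value for A = 0.85.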


APPENDIX III 


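DERIVATION OF THE $\lim_{k \to \infty}$ VARIANCE OF $(x_k - \hat{x}_{k-1})^2$


The first moment of $(x_k - \hat{x}_{k-1})^2$ is

$$E\left\{(x_k - \hat{x}_{k-1})^2\right\} = E\left\{x_k^2 - 2x_k \hat{x}_{k-1} + \hat{x}_{k-1}^2\right\}$$

$$= E\{x_k^2\} - 2E\{x_k\}E\{\hat{x}_{k-1}\} + E\{\hat{x}_{k-1}^2\}$$

This step can be made since $\hat{x}_{k-1}$ and $x_k$ are independent.

$$\lim_{k \to \infty} E\left\{(x_k - \hat{x}_{k-1})^2\right\} = a^2 + \sigma^2 - 2a \lim_{k \to \infty} E\{\hat{x}_{k-1}\} + \lim_{k \to \infty} E\{\hat{x}_{k-1}^2\}$$

$$= a^2 + \sigma^2 - 2a(a) + a^2 + \left(\frac{1 - A}{1 + A}\right)\sigma^2$$

$$= \left[1 + \frac{1 - A}{1 + A}\right]\sigma^2$$

$$= \frac{2\sigma^2}{1 + A}$$

The second moment is

$$E\left\{(x_k - \hat{x}_{k-1})^4\right\} = E\left\{x_k^4 - 4x_k^3 \hat{x}_{k-1} + 6x_k^2 \hat{x}_{k-1}^2 - 4x_k \hat{x}_{k-1}^3 + \hat{x}_{k-1}^4\right\}$$

As discussed in Chapter III, $x_k$ and $\hat{x}_{k-1}$ both have a Gaussian
distribution with known mean and variance and are independent.
Substitutions from Appendix VII yield

$$\lim_{k \to \infty} E\left\{(x_k - \hat{x}_{k-1})^4\right\} = a^4 + 6a^2\sigma^2 + 3\sigma^4 - 4a^4 - 12a^2\sigma^2 + 6a^4 + 6a^2\sigma^2 + 6\left(\frac{1 - A}{1 + A}\right)a^2\sigma^2 + 6\left(\frac{1 - A}{1 + A}\right)\sigma^4$$

$$\qquad - 4a^4 - 12\left(\frac{1 - A}{1 + A}\right)a^2\sigma^2 + a^4 + 6\left(\frac{1 - A}{1 + A}\right)a^2\sigma^2 + 3\left(\frac{1 - A}{1 + A}\right)^2\sigma^4$$

$$= 3\sigma^4 + 6\left(\frac{1 - A}{1 + A}\right)\sigma^4 + 3\left(\frac{1 - A}{1 + A}\right)^2\sigma^4$$

$$= \frac{12\sigma^4}{(1 + A)^2}$$

The limiting value is

$$\lim_{k \to \infty} \text{variance of } \left[(x_k - \hat{x}_{k-1})^2\right] = \lim_{k \to \infty} E\left\{(x_k - \hat{x}_{k-1})^4\right\} - \lim_{k \to \infty} E^2\left\{(x_k - \hat{x}_{k-1})^2\right\}$$

$$= \frac{12\sigma^4}{(1 + A)^2} - \frac{4\sigma^4}{(1 + A)^2}$$

$$= \frac{8\sigma^4}{(1 + A)^2}$$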
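
The cancellation of all terms containing the mean "a" can be confirmed numerically by carrying out the same expansion with the Gaussian moments of Appendix VII. In this sketch the function names and the values a = 3, $\sigma^2$ = 4, A = 0.85 are illustrative.

```python
def gaussian_moments(m, var):
    """First four moments of a Gaussian variable with mean m and variance var."""
    return (m,
            m ** 2 + var,
            m ** 3 + 3.0 * m * var,
            m ** 4 + 6.0 * m ** 2 * var + 3.0 * var ** 2)

def variance_of_squared_error(a, sigma2, A):
    """Limiting variance of (x_k - x_hat_{k-1})^2 from the term-by-term expansion."""
    r = (1.0 - A) / (1.0 + A)                  # limiting variance ratio of the estimate
    u1, u2, u3, u4 = gaussian_moments(a, sigma2)       # data sample x_k
    v1, v2, v3, v4 = gaussian_moments(a, r * sigma2)   # estimate x_hat_{k-1}
    second = u2 - 2.0 * u1 * v1 + v2
    fourth = u4 - 4.0 * u3 * v1 + 6.0 * u2 * v2 - 4.0 * u1 * v3 + v4
    return fourth - second ** 2

a, sigma2, A = 3.0, 4.0, 0.85
expanded = variance_of_squared_error(a, sigma2, A)
closed_form = 8.0 * sigma2 ** 2 / (1.0 + A) ** 2
print(expanded, closed_form)
```

Changing `a` leaves the result unchanged, which reflects the fact that the difference $x_k - \hat{x}_{k-1}$ has zero mean in the limit.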


APPENDIX IV 


DERIVATION OF $\lim_{k \to \infty} E\{\hat{x}_k \hat{\sigma}_k^2\}$


$$E\{\hat{x}_k \hat{\sigma}_k^2\} = E\left\{\left[A\hat{x}_{k-1} + (1 - A)x_k\right]\left[B\hat{\sigma}_{k-1}^2 + \frac{(1 + A)(1 - B)}{2}(x_k - \hat{x}_{k-1})^2\right]\right\}$$

$$= AB\,E\{\hat{x}_{k-1}\hat{\sigma}_{k-1}^2\} + B(1 - A)E\{x_k \hat{\sigma}_{k-1}^2\}$$

$$\qquad + \frac{A(1 + A)(1 - B)}{2}\left[E\{x_k^2 \hat{x}_{k-1}\} - 2E\{x_k \hat{x}_{k-1}^2\} + E\{\hat{x}_{k-1}^3\}\right]$$

$$\qquad + \frac{(1 - A)(1 + A)(1 - B)}{2}\left[E\{x_k^3\} - 2E\{x_k^2 \hat{x}_{k-1}\} + E\{x_k \hat{x}_{k-1}^2\}\right] \tag{A-4-1}$$

Since $x_k$ is independent of $\hat{x}_{k-1}$ and $\hat{\sigma}_{k-1}^2$, the expected values of
the product of these variables can be separated into the product of the
expected values. Using Appendix VII, Equation (A-4-1) becomes

$$E\{\hat{x}_k \hat{\sigma}_k^2\} = AB\,E\{\hat{x}_{k-1}\hat{\sigma}_{k-1}^2\} + B(1 - A)a\,E\{\hat{\sigma}_{k-1}^2\}$$

$$\qquad + \frac{(1 + A)(1 - B)}{2}\Big[(1 - A)(a^3 + 3a\sigma^2) + (3A - 2)(a^2 + \sigma^2)E\{\hat{x}_{k-1}\} + (1 - 3A)(a)E\{\hat{x}_{k-1}^2\} + A\,E\{\hat{x}_{k-1}^3\}\Big] \tag{A-4-2}$$

The technique for finding the limit as k approaches infinity
of this recursive equation is the same as that used in the main body of
this report for the variance of the estimated variance. The index k
is set equal to n where n has a value large enough so that all
terms on the right hand side of Equation (A-4-2) except $E\{\hat{x}_{n-1}\hat{\sigma}_{n-1}^2\}$
have become infinitesimally close to their limiting values. Using
Appendix VII and Equations (A-2-4) and (A-2-9), Equation (A-4-2)
becomes

$$E\{\hat{x}_n \hat{\sigma}_n^2\} = AB\,E\{\hat{x}_{n-1}\hat{\sigma}_{n-1}^2\} + B(1 - A)a\sigma^2$$

$$\qquad + \frac{(1 + A)(1 - B)}{2}\Big[(1 - A)(a^3 + 3a\sigma^2) + (3A - 2)(a^2 + \sigma^2)a + (1 - 3A)(a)\left(a^2 + \frac{1 - A}{1 + A}\sigma^2\right) + A\left(a^3 + 3a\frac{1 - A}{1 + A}\sigma^2\right)\Big]$$

$$= AB\,E\{\hat{x}_{n-1}\hat{\sigma}_{n-1}^2\} + (1 - AB)a\sigma^2 \tag{A-4-3}$$

Several terms are calculated in order to recognize the series being
generated. Let

$$E\{\hat{x}_{n-1}\hat{\sigma}_{n-1}^2\} = P$$

Then

$$E\{\hat{x}_n \hat{\sigma}_n^2\} = AB(P) + (1 - AB)a\sigma^2$$

$$E\{\hat{x}_{n+1}\hat{\sigma}_{n+1}^2\} = A^2 B^2 (P) + (1 + AB)(1 - AB)a\sigma^2$$

$$E\{\hat{x}_{n+2}\hat{\sigma}_{n+2}^2\} = A^3 B^3 (P) + (1 + AB + A^2 B^2)(1 - AB)a\sigma^2$$

and

$$E\{\hat{x}_{n+i}\hat{\sigma}_{n+i}^2\} = A^{i+1} B^{i+1} (P) + (1 - AB)a\sigma^2 \sum_{j=0}^{i} (AB)^j \tag{A-4-4}$$

Since A < 1 and B < 1, (A-4-4) becomes

$$\lim_{i \to \infty} E\{\hat{x}_{n+i}\hat{\sigma}_{n+i}^2\} = \lim_{k \to \infty} E\{\hat{x}_k \hat{\sigma}_k^2\} = \frac{(1 - AB)a\sigma^2}{1 - AB} = a\sigma^2 \tag{A-4-5}$$

The convergence of this series has been verified by calculating
the exact expression for the series for k = 0, 1, and 2 but has not
been included because of its length. From these expressions it is
possible to recognize the general expression for the coefficients of
all terms in the expression. It is found that the limit of coefficients
of all terms approached zero as k approached infinity except for the
coefficient of $a\sigma^2$. This coefficient is found to approach 1.0.
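
The limit (A-4-5) can also be checked by direct simulation of the two estimators with Gaussian input. The sketch below is a Monte Carlo estimate, so it agrees with $a\sigma^2$ only to within sampling error; the function name `average_product`, the seed, and the parameter values are illustrative assumptions.

```python
import random

def average_product(a, sigma, A, B, n_samples, seed=0):
    """Time average of x_hat[k] * var_hat[k] for Gaussian input of mean a."""
    rng = random.Random(seed)
    gain = (1.0 + A) * (1.0 - B) / 2.0
    # Start at the limiting means so that the start-up transient is negligible.
    x_hat, var_hat = a, sigma * sigma
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(a, sigma)
        var_hat = B * var_hat + gain * (x - x_hat) ** 2   # uses previous mean estimate
        x_hat = A * x_hat + (1.0 - A) * x
        total += x_hat * var_hat
    return total / n_samples

estimate = average_product(a=3.0, sigma=2.0, A=0.85, B=0.95, n_samples=200000)
print(estimate, 3.0 * 2.0 ** 2)
```

The simulated average should fall close to $a\sigma^2$ = 12 for these values.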



APPENDIX V 


DERIVATION OF $\lim_{k \to \infty} E\{\hat{x}_k^2 \hat{\sigma}_k^2\}$


$$E\{\hat{x}_k^2 \hat{\sigma}_k^2\} = E\left\{\left[A\hat{x}_{k-1} + (1 - A)x_k\right]^2 \left[B\hat{\sigma}_{k-1}^2 + \frac{(1 + A)(1 - B)}{2}(x_k - \hat{x}_{k-1})^2\right]\right\}$$

$$= A^2 B\,E\{\hat{x}_{k-1}^2 \hat{\sigma}_{k-1}^2\} + 2AB(1 - A)a\,E\{\hat{x}_{k-1}\hat{\sigma}_{k-1}^2\} + B(1 - A)^2 (a^2 + \sigma^2)E\{\hat{\sigma}_{k-1}^2\}$$

$$\qquad + \frac{(1 + A)(1 - B)}{2}\Big[(a^4 + 6a^2\sigma^2 + 3\sigma^4)(1 - A)^2 + E\{\hat{x}_{k-1}\}(a^3 + 3a\sigma^2)(-2 + 6A - 4A^2)$$

$$\qquad + E\{\hat{x}_{k-1}^2\}(a^2 + \sigma^2)(1 - 6A + 6A^2) + E\{\hat{x}_{k-1}^3\}(a)(2A - 4A^2) + E\{\hat{x}_{k-1}^4\}(A^2)\Big] \tag{A-5-1}$$

Since $x_k$ is independent of $\hat{x}_{k-1}$ and $\hat{\sigma}_{k-1}^2$, the expected
value of the product of these variables is separated into the product
of the expected values in the above expression.

The technique for finding the limit as k approaches infinity
of this recursive equation is the same as that used in Appendix IV.
The index k is set equal to n where n has a value large enough
so that all terms on the right hand side of Equation (A-5-1) except
$E\{\hat{x}_{n-1}^2 \hat{\sigma}_{n-1}^2\}$ have become infinitesimally close to their limiting
values. Using Appendix VII, Equation (A-5-1) becomes

$$E\{\hat{x}_n^2 \hat{\sigma}_n^2\} = A^2 B\,E\{\hat{x}_{n-1}^2 \hat{\sigma}_{n-1}^2\} + 2AB(1 - A)a^2\sigma^2 + B(1 - A)^2 (a^2 + \sigma^2)\sigma^2$$

$$\qquad + \frac{(1 + A)(1 - B)}{2}\Big[(a^4 + 6a^2\sigma^2 + 3\sigma^4)(1 - A)^2 + (a^4 + 3a^2\sigma^2)(-2 + 6A - 4A^2)$$

$$\qquad + \left(a^2 + \frac{1 - A}{1 + A}\sigma^2\right)(a^2 + \sigma^2)(1 - 6A + 6A^2) + \left(a^4 + 3\frac{1 - A}{1 + A}a^2\sigma^2\right)(2A - 4A^2)$$

$$\qquad + \left(a^4 + 6\frac{1 - A}{1 + A}a^2\sigma^2 + 3\left(\frac{1 - A}{1 + A}\right)^2\sigma^4\right)A^2\Big]$$

$$= A^2 B\,E\{\hat{x}_{n-1}^2 \hat{\sigma}_{n-1}^2\} + (1 - A^2 B)a^2\sigma^2 + \frac{\sigma^4}{1 + A}\left[(1 - A)(1 - A^2 B) + (1 - A)^2 (1 - B)\right] \tag{A-5-2}$$

$E\{\hat{x}_{n-1}^2 \hat{\sigma}_{n-1}^2\}$ is set equal to an arbitrary constant Q, and several
terms are calculated in order to recognize the series being generated:

$$E\{\hat{x}_{n-1}^2 \hat{\sigma}_{n-1}^2\} = Q$$

$$E\{\hat{x}_n^2 \hat{\sigma}_n^2\} = A^2 B(Q) + (1 - A^2 B)a^2\sigma^2 + \frac{\sigma^4}{1 + A}\left[(1 - A)(1 - A^2 B) + (1 - A)^2 (1 - B)\right]$$

$$E\{\hat{x}_{n+1}^2 \hat{\sigma}_{n+1}^2\} = A^4 B^2 (Q) + (1 + A^2 B)\left\{(1 - A^2 B)a^2\sigma^2 + \frac{\sigma^4}{1 + A}\left[(1 - A)(1 - A^2 B) + (1 - A)^2 (1 - B)\right]\right\}$$

The general term can be recognized to be

$$E\{\hat{x}_{n+j}^2 \hat{\sigma}_{n+j}^2\} = A^{2(j+1)} B^{j+1} (Q) + \left\{(1 - A^2 B)a^2\sigma^2 + \frac{\sigma^4}{1 + A}\left[(1 - A)(1 - A^2 B) + (1 - A)^2 (1 - B)\right]\right\}\sum_{i=0}^{j} (A^2 B)^i \tag{A-5-3}$$

Since $A^2 B < 1.0$,

$$\lim_{k \to \infty} E\{\hat{x}_k^2 \hat{\sigma}_k^2\} = \lim_{j \to \infty} E\{\hat{x}_{n+j}^2 \hat{\sigma}_{n+j}^2\}$$

$$= \frac{1}{1 - A^2 B}\left\{(1 - A^2 B)a^2\sigma^2 + \frac{\sigma^4}{1 + A}\left[(1 - A)(1 - A^2 B) + (1 - A)^2 (1 - B)\right]\right\}$$

$$= a^2\sigma^2 + \sigma^4\left[\frac{1 - A}{1 + A} + \frac{(1 - A)^2 (1 - B)}{(1 + A)(1 - A^2 B)}\right] \tag{A-5-4}$$

The limit of this sequence has been verified by the same method
of verification discussed in Appendix IV.
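
The step from the recursion (A-5-2) to the limit (A-5-4) can be checked by iterating the recursion from an arbitrary starting value Q. The sketch below is only a numerical confirmation of the algebra; the function names and parameter values are illustrative.

```python
def limit_closed_form(a, sigma2, A, B):
    """Closed-form limit of E{x_hat^2 * var_hat} as written in (A-5-4)."""
    return (a * a * sigma2
            + sigma2 ** 2 * ((1.0 - A) / (1.0 + A)
                             + (1.0 - A) ** 2 * (1.0 - B)
                             / ((1.0 + A) * (1.0 - A * A * B))))

def limit_by_iteration(a, sigma2, A, B, q0=0.0, steps=2000):
    """Iterate the limiting recursion Q <- A^2*B*Q + constant terms of (A-5-2)."""
    const = ((1.0 - A * A * B) * a * a * sigma2
             + sigma2 ** 2 / (1.0 + A)
             * ((1.0 - A) * (1.0 - A * A * B) + (1.0 - A) ** 2 * (1.0 - B)))
    q = q0
    for _ in range(steps):
        q = A * A * B * q + const
    return q

closed = limit_closed_form(3.0, 4.0, 0.85, 0.95)
iterated = limit_by_iteration(3.0, 4.0, 0.85, 0.95, q0=100.0)
print(closed, iterated)
```

The result does not depend on the starting value `q0`, which corresponds to Q being arbitrary in the derivation.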
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DO 23 J=*l, 64- 
FAC=PXY ( I / J ) 

I F (FAC- 0.0) 23, 23, §7 
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APPENDIX VII 


MOMENTS OF A GAUSSIAN VARIABLE 


If x is a probabilistic variable having a Gaussian probability
distribution with mean of "m" and variance of "$v^2$", the first
four moments of x are (Ref. 11, p. 162):

$$E\{x\} = m$$

$$E\{x^2\} = m^2 + v^2$$

$$E\{x^3\} = m^3 + 3mv^2$$

$$E\{x^4\} = m^4 + 6m^2 v^2 + 3v^4$$
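
These four moments can be confirmed by numerical integration of the Gaussian density. The sketch below uses a simple midpoint rule over ten standard deviations on either side of the mean; the function names, the step count, and the test values m = 1.5, v = 0.7 are illustrative.

```python
import math

def gaussian_moments(m, v):
    """First four moments of a Gaussian variable with mean m and variance v^2."""
    return [m,
            m ** 2 + v ** 2,
            m ** 3 + 3.0 * m * v ** 2,
            m ** 4 + 6.0 * m ** 2 * v ** 2 + 3.0 * v ** 4]

def numeric_moment(order, m, v, steps=20000, width=10.0):
    """Midpoint-rule integral of x^order times the Gaussian density."""
    lo = m - width * v
    dx = 2.0 * width * v / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        pdf = math.exp(-0.5 * ((x - m) / v) ** 2) / (v * math.sqrt(2.0 * math.pi))
        total += x ** order * pdf * dx
    return total

m, v = 1.5, 0.7
formulas = gaussian_moments(m, v)
numeric = [numeric_moment(order, m, v) for order in (1, 2, 3, 4)]
print(formulas)
print(numeric)
```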
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